Episode 124 - Breaking Up with Tech Debt: A Love Story with M. Scott Ford
The .NET Core Podcast
Software maintenance is an integral part of any software development project, but it is often neglected, leading to a range of problems down the line. M. Scott Ford, the co-founder, chief code whisperer, and CTO of CorgiBytes, is passionate about helping teams make improvements to their existing software systems rather than throwing them away and starting from scratch. In this episode of The .NET Core Podcast, Ford and Jamie discussed the challenges of maintaining software and some of the tools they use to make it easier.
One of the tools they discussed was Freshli, a tool that visualizes tech debt and shows how difficult a codebase is to work with. It tracks the age of dependencies and can help identify potential risks before they become major problems. They also discussed the Equifax hack from 2017, which was caused by a single outdated dependency. This highlights the importance of keeping dependencies up to date and the potential risks of neglecting them.
The conversation also touched on the need for more spaces for developers to meet up and exchange ideas and feedback. Ford and Taylor believe that the real nuggets of information are shared in the conversations that happen between talks or at the pub afterwards, and that these sessions allow people to do that organically.
One key takeaway from the discussion was the idea of making small, incremental improvements to a project rather than trying to fix everything at once. This approach is similar to a sports team focusing on individual attributes to improve overall performance. It can be more manageable and less overwhelming for a team to make small changes rather than trying to tackle everything at once.
The podcast also highlighted the challenge of dependencies becoming out of date and the importance of tracking and updating them regularly to reduce risk. Ford and Taylor discussed the use of dependency freshness metrics, specifically the libyear metric, to track the age of dependencies and how it changes over time. This metric can be used to communicate the level of risk a development team is carrying to leadership.
Overall, the conversation in this episode sheds light on the importance of software maintenance and the tools available to make it easier. Outdated dependencies in software projects can have a significant impact on productivity and security, but it is often an invisible problem that goes unnoticed by leadership. By implementing tools like Freshli and tracking the age of dependencies, teams can reduce the potential risks and make software maintenance more manageable.
Hello everyone and welcome to The .NET Core Podcast, an award-winning podcast where we reach into the core of the .NET technology stack and, with the help of the .NET community, present you with the information that you need in order to grok the many moving parts of one of the biggest cross-platform, multi-application frameworks on the planet.
I am your host, Jamie “GaProgMan” Taylor. In this episode, I talked with M. Scott Ford of the Legacy Code Rocks podcast and CorgiBytes about Freshli, libyear, and how you can visualise both your tech debt and how difficult a code base will be to work with.
Along the way, Scott teaches us about libyear, talks about the standard sawtooth chart that most projects will have for their dependencies, and we discuss ways to attempt to get buy-in from the decision makers about how and when to give us budget for fixing these problems. We also talk about the fact that the Equifax hack from 2017 was due to a single dependency being out of date.
So let’s sit back, open up a terminal, type in dotnet new podcast and let the show begin.
So, Scott, thank you ever so much for being back on the show. I mean, it hasn’t been…
Yeah, thanks for having me.
Yeah, you’re very welcome. Very welcome. My goodness.
So, yeah. For the folks who maybe haven’t caught part one of our discussion, I was wondering, perhaps could you give folks a bit of an elevator pitch as to, “who Scott is and the wonderful work he does?”
Okay. Hopefully it’s wonderful.
But yeah. So I am the co-founder, chief code whisperer, and CTO of CorgiBytes. We focus on helping teams make improvements to software systems they already have, with hopes that they won’t necessarily have to throw away the mess that they’re currently working on. So kind of help people dig their way out of messes that they’ve got themselves into. That shows up in a lot of different ways. It can be a legacy code project with lots of aging technologies, or it could be just something that was built in a hurry because there was a need to figure out whether or not the idea would work. And now it’s a mess and it needs to be maintained. And so getting some help cleaning that up.
A lot of that cleanup work that we do comes in the form of adding test coverage. That’s something we advocate for a lot. We also do reviews of people’s code bases and provide recommendations. We help pay down technical debt, upgrade dependencies - which I think we’re going to dig into a little bit later - and pretty much any maintenance activity that you can think of on a software project that most developers don’t enjoy doing. We actually do; we genuinely enjoy doing that stuff, and it really started because I genuinely enjoy doing that stuff and have kind of built a community around it over at legacycode.rocks.
Hopefully that’s the elevator pitch. I think that might have been an elevator ride. Kind of a long elevator ride.
No, I like it. So one of the things that I wanted to quickly point out as well is obviously, Legacy Code Rocks. I mean, I’m a big fan, so people should definitely be listening to that, for sure.
Yeah, it’s fun. We’ve been doing the podcast for, I think, six years now. I’d have to do some math. Seven years? What year is it? And then I think we’re coming up on 150 episodes, kind of in the ballpark of that. So, yeah, it’s been a lot of fun.
That’s pretty cool, I have to say. Between our previous chat and this one - a bit of inside-baseball talk: these are recorded way in advance, so there’s a number of months between our previous chat and this one - I’ve actually gone ahead and read a book called Kill It With Fire, which I don’t know whether you know, but I was right at the end of it, and suddenly there’s a Legacy Code Rocks line in there. I’m like, “what!” The author uses house parties as a metaphor for technical debt. Right. You can have a house party but never clean up. If you keep having the house parties, you can end up with a house full of trash. Right?
Yeah. The author, I’m blanking on her name, so I’m looking it up real quick.
Marianne. Oh, gosh.
Yeah. Marianne Bellotti. Right?
Yes. Thank you.
Yes. Marianne Bellotti. So we’ve had her on the podcast, and she also gave a talk at one of the MenderCon conferences that we host. So we’ll be hosting that again - I don’t know if these episodes come out before or after that happens again, but that’ll be in May this year.
Find information about that at mendercon.com.
But it’s an online virtual conference focused on software maintenance and kind of everything around it. We run it as a dual track. Sitting down and watching talks is kind of one half, and the other track is an open space. So if you’ve got an idea that you want to share with somebody else, or if you want to get practice speaking, that can be a great forum for that. And the open space tends to be way more interactive, which is a little bit more ideal, I think, for a virtual conference than the typical sit and stare at your screen and wonder if anybody else is watching.
Nice. I like it. I feel like there should be more spaces for devs to sort of meet up and exchange ideas and get feedback on them. So I really like this idea.
Yeah. And we have a weekly virtual meetup that we host as well. It’s every Wednesday at 1:00 p.m. Eastern, and that can be a great space to chat with other folks who enjoy software maintenance work. We don’t usually have a defined speaker for that, but if somebody wanted to be a defined speaker, we could certainly let them have the floor and kind of run the show. It’s usually more of a facilitated discussion around a topic. Sometimes we don’t have a topic; it’s just kind of checking to see if anybody has a pressing need or concern, and then we talk about that.
Great. No, I really do like that because I feel like one of the let me start again, right? We’ll leave that in, but let me start again.
In Tabs and Spaces, we started that up - and obviously you were on it very recently. We started it up as, “what if we could capture the lightning that is the talks between the talks in the conference hall, or the talks that happen at the pub afterwards, or at the bar, or whatever you want to call it.” And I feel like that’s where the real nuggets of information are shared, right? You’ll go to a talk, someone will get up and give a fantastic talk for 45 minutes to an hour, and you’ll learn loads. And then maybe you’ll go up to the person who gave the talk and ask them a question. And that right there is where the real knowledge is shared, because everything else is practiced. You’re coming to that speaker with a question that they haven’t been able to prepare for, and so you meet the real person in that moment. And I feel like these sessions that you’re organizing, this whole thing that you’re putting on, allows people to do that organically, I feel.
And it sounds like y’all’s motivation with Tabs and Spaces was very similar to our motivation with Legacy Code Rocks. My business partner, Andrea, who was the co-host for well over half the episodes - it was her idea to say, “hey, what if we recorded these awesome conversations we were having with people so that we could listen to them later?” And yeah, it’s been great.
Another thing that’s kind of come out of that is one question we ask everyone. It’s kind of the final question we ask everyone who’s on the show: “what do you love about legacy code?” And Andrea pulled together a talk a couple of years ago where she went through and kind of catalogued everyone’s answers that we had accumulated over all the episodes, and gave a little bit of a report about that, kind of categorizing the different responses we got, and just kind of what does that say about the variability of the definition of the term, attitudes towards it? And it’s a data set of people who actually were able to answer the question, where I think there are many folks who, when presented with that question, would be like, “nothing.”
Nice. I like that. Yeah, that’s a great idea for a talk because then that can spark a conversation about what the people in the room think and whether they would be interested in working specifically with legacy code, or indeed that gives them a chance to reflect on their own code they’re writing. Because my personal opinion is that as soon as my fingers are off the keyboard, it’s legacy code, right? I wrote it and I had a thought in the moment and by the time my fingers are away from the keyboard or if I’m using an assistive technology, by the time I’ve stopped talking - by the time I’ve stopped giving the computer instructions to then later compile, that thought is gone. Right? And so what is remaining on screen is lacking the context of what’s in my head. Right? So that’s my own personal definition of it.
And I feel like if more developers took a little bit more time to think - I’m worried about saying it, but maybe with compassion for the other developers who are going to read the code - to try and come up with a way of sharing that knowledge. Maybe in a wiki or maybe in comments. Not the what but the why, because that’s my thing, right? It’s got to be the why, because I can see that i increments every time we go around the loop. But why are we going around the loop, and why are we going around this many times, right?
Yeah. And I really like that point that you bring up: basically kind of having empathy for other developers, thinking about them, how what you’re creating is going to affect them. And at the same time, we just did a review for a code base where the team wants to be producing more quality software, where they do care about those things and they invest the time in that stuff, and their leadership has made it clear that that’s not a priority. One of the things that I hope is that the report that we’ve created can be used as evidence of the fact that this is the state that things can end up in. When you don’t make maintenance activities part of your normal way of working - when it’s an afterthought, when it’s something that can always be put off till later - then I think that is largely how messes accumulate. And it’s not necessarily that you should never make a mess. I think we’re human, we’re going to make messes. And sometimes it makes a lot of sense to make a mess. There are really good reasons for it. It’s more making sure you also take the time to clean up the mess and think of that as part of the work and not something that’s extra to the work. It’s not a, “oh, we have to stop feature development so we can clean up the mess.” It’s more, “how can we integrate the two? How can we make sure we’re doing clean up work and feature development at the same time?”
Sure, I’ve always likened it to and it gets used a lot, but I always likened it to the Boy Scout rule. “Leave the camp ground a little bit better than when you found it,” right?
Yes. I think it’s a great metaphor.
Another one is kind of like little incremental improvements. There are stories of sports teams just looking at all the things they could possibly improve that might have an impact on where they want to be. And they focus on those one at a time, recognizing that they can’t fix everything all at once. But they focus on improvement: they measure how much the individual value improved, and then how much the bigger picture that they’re trying to achieve improved, and just make small, little incremental changes. And eventually those really add up. It doesn’t have to be everything done all at once, or kind of boil the ocean, so to speak. Especially on a project where there has been a fair amount of neglect for whatever reason, the problem can seem intractable, and the temptation can certainly be there to say, “oh, we’ll do this right on the next project. Once we start this microservice that we’re thinking of extracting out, that’s where we’ll start doing the practices that we really want to be practising.”
I think it’s important to kind of recognize the systemic effect that generated the state of the system that you’re working with. And are you addressing those systemic issues? Because if you’re not, there will be organizational pressure to kind of put you back in that state.
Yeah, I agree, and I think it’s in Atomic Habits , one of the examples that James Clear gives is a basketball team, I want to say, and essentially talks exactly about what you were saying. They wanted to grow as a team, and instead of thinking of, “well, let’s win all the games and get to the playoffs and win the championship,” I want to say it was basketball, but I’m not sure what they did was like you said, they focused on the individual attributes of every single player. How can you be better at layups? How can you be better at passing? How can you be better at defending? How can you be better at taking free throws or whatever? I’m not sure. I’m probably using the wrong terms, right? There’s probably loads of basketball fans screaming at the phones and screaming at their car, “how can you get the wrong terms, Jamie?” Because I’ve never really been interested in that world. But, yeah, he talked about how making those small incremental changes using the techniques from Kaizen, which is this wonderful Japanese idea of continual improvement through tiny changes. By doing that, it has a compounding interest effect. Right. You get better at passing to other players, other players get better at receiving that pass, which means they are better at doing the next thing right and then it compounds from there.
Yeah, I’m not sure what more to say.
Sorry, of course. No, I totally get it.
So let’s talk about ways that people can do that then. So we touched last time we talked on this project called Freshli that you’ve started. So let’s talk a little bit about that and how that helps with legacy code and dependencies and such.
So one of the challenges that I’ve really noticed over and over again, project to project, is that dependencies tend to get really out of date, and it’s a largely invisible problem. I think there are some language communities where it’s more visible than others. So I think some of the npm tooling in the Node ecosystem does a decent job of letting you know, just when you install the packages, how many of them are out of date or how many of them might have security vulnerabilities; that’s information you get at install time. And that’s pretty handy. There are other language communities where you kind of just don’t know. Like in the .NET ecosystem with NuGet, when you do dotnet restore, or if you’re working with NuGet directly, you don’t get that information. You don’t get information about which packages are out of date and which ones have security vulnerabilities or anything like that.
So one of the things that I really wanted to do was experiment with the idea that if information about just how out of date things are were made visible, whether or not behaviour would change as a result. And kind of two aspects of that: one being, “is the team aware,” and the other, “is their management aware.” And in addition to just providing a spot metric of “this is how things are now,” I wanted to show how things have changed over time. So I was really interested in seeing the delta between the versions the team is currently on and the latest available versions, and kind of how that grew over time. There is a dependency metric, and “dependency freshness” is the term. And there’s a paper - I forget the name of the paper, but you can find it by going to libyear.com and scrolling down to the bottom of the page, where there’s a little footnote for it. I think that paper is the paper that introduced the term dependency freshness, and it proposes a few different dependency freshness metrics: ways to evaluate how far out of date a library is. And I think at least one of them looks at kind of assumptions around semantic versioning. So how many major releases behind are you? How many minor releases behind are you? How many patch releases behind are you? I think that’s pretty useful.
But the metric that I like the most, and the one that I’ve been experimenting with the most recently, is libyear. The way that’s computed is you take the release date of the version you’re currently using for a particular dependency, and you take the release date of the latest version - and I would say the latest version with the highest version number, because sometimes the most recently released version could be 1.2, but 2.0 may have been released two months ago. Right? So if you sort just by date, 1.2 looks newer than 2.0, but 2.0 is the one to look at. So looking at that metric, and checking to see how far out of date things are, and looking at how that’s changed over time, and looking at it at a project-wide level. So across all of the dependencies that a project is using, what is the average, what is the total.
Comparing libyear project to project is a little challenging, because the number you get is highly dependent on the number of dependencies. So you’re kind of only comparing apples to apples if two projects have the same number of dependencies. But I think in general, probably around 100 as a total score for a project is probably okay - hopefully it’s not worse than that. Although if you only have ten dependencies, then 100 is really bad, right? Because each one is ten years out of date on average. But assuming you’ve got 100 dependencies, a total of ten is not so bad, because on average each package would only be about five weeks out of date in order to get a ten in that case.
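The arithmetic behind libyear is simple enough to sketch. Below is a minimal illustration in Python using made-up package names and release dates (a sketch of the idea Scott describes, not Freshli’s actual implementation): each dependency’s libyear is the gap between the release date of the version in use and the release date of the latest version by version number, and the project-wide total, average, and maximum are just aggregates of those.

```python
from datetime import date

def libyear(current_release: date, latest_release: date) -> float:
    """Libyear for one dependency: years between the release date of the
    version in use and the release date of the latest version.
    Clamped at zero -- being on the newest release scores 0.0."""
    return max(0.0, (latest_release - current_release).days / 365.25)

# Hypothetical dependency data: (name, release date of the version in use,
# release date of the latest version *by version number* -- since, as Scott
# notes, the most recently published release isn't always the highest version).
deps = [
    ("WebFramework", date(2018, 3, 1), date(2022, 9, 15)),
    ("JsonLib", date(2021, 6, 10), date(2022, 1, 5)),
    ("LoggingLib", date(2022, 1, 5), date(2022, 1, 5)),  # fully up to date
]

scores = {name: libyear(current, latest) for name, current, latest in deps}
total = sum(scores.values())
average = total / len(scores)
worst = max(scores, key=scores.get)

print(f"total libyear:   {total:.2f}")
print(f"average libyear: {average:.2f}")
print(f"max libyear:     {scores[worst]:.2f} ({worst})")
```

The total is the project-wide number that gets graphed over time, and the maximum is the per-package "worst offender" value discussed further on.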
I think that metric, when graphed over time, is really neat, because the shape of the graph tends to be up and to the right. When you dig into the project’s history, look at the very first commit, and compute this metric on that commit, the project’s total libyear is pretty much zero. And then as time goes on, the libyear just slowly creeps up and creeps up and creeps up. And every now and then there’ll be kind of a spike downwards where the team has upgraded a major dependency. It tends to be around a base framework upgrade, is what I’ve noticed. So take, say, a Ruby on Rails web application: the libyear metric will kind of go up and to the right until the team upgrades to the next version of Rails. And then it’ll drop, maybe by a third, and then it’ll continue to go up and to the right, and then it’ll drop, usually by the same delta that it did before - but it’s no longer a third, it’s now like a seventh. And then it continues to go up and to the right. And there will occasionally be that dip down, but the team never gets back to zero - it’s very rare for teams to get back to zero.
And if I were to kind of tell leadership and show them this graph and say, “this is a graph of how much risk your development team is carrying while they’re working on your project” - I wish I could translate it to an actual risk metric and put that in terms of risk, in terms of dollars. I think there’s a little bit more research needed to be able to do that, because otherwise it’s kind of hand-wavy, like, “oh, it’s risk.” But I do think it’s a pretty decent stand-in for that as a concept, for kind of communicating to someone who doesn’t understand the underlying metric and what it means. That’s pretty useful.
I also like graphing the maximum value. I think that’s really interesting. So across all the packages that the project is depending on, what’s the highest libyear value, and graphing how that’s changed over time. And usually this kind of stair-steps for the projects that I’ve graphed, where it’ll start out at zero, and then it’ll jump to one, and then it’ll jump to two and kind of stay there for a little while, and then it’ll jump to three and stay there for a while. And then it’ll jump to eight and just kind of stay there, and then jump to twelve. You end up with that kind of stair-step effect, and very rarely does that number go down. But across the projects that I’ve looked at, that number does go down more often than the total does. So you’ll have a project that gets to a libyear of three or four and might hang out there. So even though the total metric is expanding and expanding, it does show that the team is doing a pretty decent job at making sure that nothing is more than three years out of date. So that is a measure of up-to-date-ness, I guess. But it’s not something that the average team member is likely really happy with.
And then I think the pain of outdated dependencies shows up in a couple of different ways. I think one comes in terms of working with the library and wanting to do certain things with it. I often find that I’ll go look up the documentation for a library that I’m using, and I’ll find a method that I want to use in the documentation, and then I go to use it in my project only to find out, no, the version that the project is on doesn’t have that method yet. So then I have to make a decision. Do I upgrade this library just to get that method? And what’s the impact of doing so? And there are a lot of projects where that upgrade is risky. If the safety net that’s needed to keep dependencies up to date isn’t there, it can be difficult to build the trust and the confidence that you can upgrade dependencies without breaking stuff. So that’s definitely a real challenge.
But the frustration shows up in that way of, “I had a problem, I went searching for the answer to it, I found the answer, I came back to my problem, only to learn that the answer I found I can’t use. So now I have to go do more searching and figure out: okay, before the library added the support, how were people solving it then?” And that can be really difficult to tease out. Finding Stack Overflow posts that are old enough - kind of around the era of when the library was created - or looking for blog posts with a particular publication date, that can be challenging. Looking through the issues list and seeing if somebody posted an issue and if there might have been a workaround. Is it something that you can backport to the version that you’re using? If it’s an open source library, is it possible to copy that method’s implementation into your project and get that capability? That’s a lot of work, all to do something that - if you had the safety net in place where you knew you could upgrade a dependency safely and you would know whether or not anything is broken - kind of the happy path for that is you upgrade the dependency and you just use the new method, and that’s a disruption of a few minutes and you’re on your way. Whereas the other route is hours or days of trying to figure this out. And I think that is something that happens often for developers when they’re working with an older library. And I think that is part of why it’s invisible: it’s kind of baked in to people doing their work, and leadership doesn’t see it. Leadership might see it in terms of things taking longer than they used to. And I think this is one of the ways that things can take longer than they used to take: you’re kind of fighting the tool in order to get the job done.
So I think that’s a big impact, a big invisible impact. And then you also have the security vulnerabilities, which is another big impact, and that’s something that I want to add better support for in Freshli. In addition to graphing libyear, I want to graph the number of vulnerabilities of different types. So I’d like to see how the number of critical vulnerabilities has changed over time, and how the number of major vulnerabilities has changed over time. I think this would be one where you could do a stacked area graph and see how that’s changed over time. And my fear is that it’s going to be kind of up and to the right as well, and that the number of critical vulnerabilities will kind of go up and up and up, and most teams won’t be upgrading to package versions that don’t have the vulnerability. And that is part of what really got me interested in dependencies in the first place as a problem area. Equifax - I don’t know if you’re familiar with Equifax, I know you’re in the UK. I don’t know how distributed they are in terms of credit monitoring, whether it’s just a US thing or how global their reach is.
Yeah, I have no idea whether they make it over here and do a lot of work over here, but I think I know about the security issue you’re about to mention.
Yeah. In the US, they’re kind of one of the big three companies that your credit gets checked against to determine whether or not you’re going to get a credit card when you apply for one and what your limit might be. So they have a ton of information about you. They have information about what banks and financial institutions you have bank accounts at, what addresses you had utility accounts at. They have tons and tons of data about your financial history and how reliable you look as a customer who has to pay bills. And they were hacked, they had this breach. Tons and tons of data was released about people’s financial habits and financial data. I was swept up in that, my data was exposed. I got a settlement check in the mail from the settlement, from the lawsuits that came out of that, and it was $5. So that was my compensation for it.
And things like that happen all the time, no big deal. When I hear about a breach like that, I think like, “oh, it was probably someone on their team accidentally wrote a SQL injection vulnerability or something like that. Had they done penetration testing, they might have found it.” That’s the kind of thing that seems harder to me. Having security expertise on your team is a bigger challenge, and it’s one of those challenges that kind of smaller companies often struggle with. But that’s not the kind of breach - that wasn’t really the reason for their vulnerability.
They had one library that was out of date - one library, and it was Apache Struts, their underlying web framework, so it was a big one. So just to say “one library” is a little flippant, because probably half the dependencies they had in their dependency list all depended on it. But there was a two month window between when the vulnerability was published and when they were hacked. So during that two month period, if they had taken action and done the upgrade, then people’s data wouldn’t have been exposed, at least not in that way. Right?
Not because of that vulnerability, anyway. With security it is still a challenge, because if you’re running an organization that looks like an appealing target, then people are going to pound on it, and it’s hard to keep the determined out. But hopefully you’re at least making it hard.
And I think that is one case where it’s unfortunate that there was this package that had been patched: the patch was available for two months, and the fact that there were exploits in the wild was talked about. And I don’t know what went on internally with the team. I do want to say that I come to problems like this with the assumption that everyone involved was doing their best given the constraints and the knowledge that they had. And again, kind of pointing back to this being an invisible problem: it is hard to notice. But I do think we can learn a lesson from that incident, and we can kind of make a commitment that events like that are going to become really rare. That’s within our power. Making something like that extremely rare is within our power.
A Request To You All
If you’re enjoying this show, would you mind sharing it with a colleague? Check your podcatcher for a link to show notes, which has an embedded player within it and a transcription and all that stuff, and share that link with them. I’d really appreciate it if you could indeed share the show.
But if you’d like other ways to support it, you could:
- Leave a rating or review on your podcatcher of choice
  - Head over to dotnetcore.show/review for ways to do that
- Consider buying the show a coffee
  - The BuyMeACoffee link is available on each episode’s show notes page
  - This is a one-off financial support option
- Become a patron
  - This is a monthly subscription-based financial support option
  - And a link to that is included on each episode’s show notes page as well
I would love it if you would share the show with a friend or colleague, or leave a rating or review. The other options are completely up to you, and are not required at all to continue enjoying the show.
Anyway, let’s get back to it.
Yeah. How do we stop this from happening again? What do we put in place to stop it? And I feel like libyear is a great way to do that, and obviously Freshli is an implementation of libyear and some other stuff. So perhaps, and this is just conjecture here, if the studies on libyear had been around, and if Freshli had been around, somebody might have been able to go to the decision maker and say, “hey, we’re in a really tricky situation here, this could cause a problem.”
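For anyone who hasn’t met it, the libyear metric (from the “Measuring Dependency Freshness in Software Systems” paper linked below) is just a sum: for each dependency, the elapsed time between the release of the version you’re using and the release of the newest available version. A rough sketch in Python, with entirely made-up dates (Freshli itself is written in C#, so this is only an illustration of the calculation):

```python
from datetime import date

def libyear(dependencies):
    """Sum, over all dependencies, the years between the release of the
    version in use and the release of the newest available version."""
    behind_days = sum(
        max(0, (latest_release - used_release).days)
        for used_release, latest_release in dependencies
    )
    return behind_days / 365.25

# Hypothetical release dates, purely for illustration:
deps = [
    (date(2019, 6, 1), date(2022, 6, 1)),  # roughly three years behind
    (date(2021, 1, 1), date(2022, 1, 1)),  # roughly one year behind
]
print(round(libyear(deps), 1))  # → 4.0
```

A project carrying “4.0 libyears” is, in aggregate, four years behind its upstream releases, which is the kind of single number that can be put in front of leadership.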
I do think a big part of it is leadership setting priorities and saying, “if a vulnerability is detected, it’s dealt with like that; it becomes a high priority. Yes, there are other things that are also high priority, but those things become secondary to a security vulnerability that we’ve become aware of.” Adopting that as a mindset, setting that as an expectation from an organizational perspective, is huge, and can be a big shift for a lot of organizations, because sometimes addressing security vulnerabilities can be disruptive. It can mean that the feature that you wanted to get out this month isn’t happening this month; fixing the vulnerability had to happen instead. So the feature work got pushed by a week or two, or however long the impact was. But I think if you do adopt that mentality then you can start to get to that point.
There’s also a Martin Fowler blog article that talks about how, “when things are difficult, do them more often”. Like, when things are painful, try to do them more often. Dependency upgrades tend to be really painful, so if you do them more often and work on making it easier - kind of like we were talking about continuous improvement earlier - the thing that you can be trying to continually improve very well could be, “how painful is it to do an upgrade?”, and you try to make that easier each time you upgrade a dependency. Another metric is, “how much trust do you have in your automated testing or your manual testing?” Whatever your QA strategy is, how much trust do you have in that approach to find problems that might be introduced when you upgrade a dependency? I think measuring that trust over time, and trying to figure out how to make an improvement there, could go a long way.
But yeah, I do think leadership has a big role to play. A challenge with technical debt, and the different contributors to technical debt - and of course I view dependency freshness as a technical debt measure - is that there are leaders who have become numb to hearing about it. People are complaining all the time about technical debt; developers always want time to refactor. This is something that I’ve heard from leadership at organizations that have custom software systems. A complaint that I’ve heard often is that “teams are kind of continuously complaining that they don’t have enough time to do clean up work.” It’s just become something that leadership starts to ignore. It’s kind of like the little beep-beep-beep when the car backs up. After a little while the person driving doesn’t hear it any more; they just tune it out. The person who happens to be walking by might hear it, but it becomes a sound that gets tuned out. You get acclimated to it. And I think some of the ways that technical debt issues get brought up can fall victim to the same kind of cognitive effect: if the complaint is too repetitive, it can start to be ignored.
Yeah, and I fully agree with that because I feel like perhaps “technical debt” is a phrase that’s used way too much, right? “We need to upgrade the thing. We need to change the libraries. We need to do this, we need to do that because technical debt, because technical debt, because it will be more painful and take longer to do it later, we need to do it now.” And so I fully understand that in a project where you have agile or scrum or whatever and you’re focusing on that velocity, “how do we ship more features, more features, more quickly, get it out to customers, fix the bugs. That’s what’s important because that’s what they’re paying us to do.”
Whereas if you look at smartphone operating systems: you buy a brand new smartphone, and on the day of release you have three years, or sometimes five years if you’re going down the Apple route, I think. You have so many years of, “we will support this hardware by releasing security updates,” and after that point they just cut you off. Because the whole time that, say, Samsung or Motorola or Google or whoever is releasing updates, they’re having to backport those updates into the version that’s on your phone, not the version that’s in development. And that takes engineering time. So obviously the companies have to say, “hey, we’ve got X amount of engineering time, we’ve got so many people, and this is now legacy code and we’re not making any money from it.” So I totally understand why companies will go, “it’s not really worth it. It doesn’t matter.” Right up until there’s a breach, right? Or right up until there’s an issue.
And there’s something that Tanya Janca talks about in her book - I think it was Alice and Bob Learn Application Security - but also in a conversation she had with me. I can’t remember the phrase she uses, but it’s like a responsibility sheet. You point out to someone, “hey, there is a problem here, and if we don’t fix it in this amount of time, something bad will happen. I don’t know what the bad is, but something bad will happen.” And they say, “well, that’s not on our roadmap, so we’re not going to do it.” You then present them with a document that says, “I accept the risk that if I don’t give you the time to update this thing or perform this maintenance, then it is my fault for not giving you that time.” The problem with that, of course, is that it works if you’re external to that team, right? If you’re like a contractor or an external person, you can make that claim. You can say, “well, okay, I’ve told you; you’re accepting responsibility.” But in a team - using the sports analogy, right - if I said to you, “hey Scott, you’re going to play point guard, and if somebody scores a basket” - can you tell I don’t know anything about basketball? - “if somebody gets the ball, it’s your fault, not mine.” Whereas actually we’re supposed to be working as a team to stop that from happening, right? And so it’s a great idea, but I feel like for internal teams it maybe doesn’t work so well.
But maybe we need something like that. Maybe we need to shake it up to say, “no, we as a team are not taking responsibility. We have told you and we’re signing it off. You’re signing off and saying you’ve taken responsibility so that when the breach happens, it’s on your head.”
Yeah, I could see that being something that the team might be able to communicate to leadership. So if their priorities are being set externally to the team, and the team has done their best to communicate that something’s important, and they feel like they’re not being listened to, then I can see that being something that the team could try on their own: “hey, we’ve tried to communicate this risk to you. It’s one that’s hard to quantify in terms of dollars, but it’s certainly there, and we think you should really address it.” Yeah, that’s an interesting approach.
Perhaps these teams could use Freshli, with its graphs and charts, as the visual representation.
Look, we’re getting further and further and that’s something that I would love to see.
So I want a wide variety of dependency measures, because there are projects that have a variety of dependencies. I think it’s pretty common for a web project, especially if it’s kind of new, for the back end technology to be different from the front end technology; that’s a common pattern that you see. Or let’s say it’s a monorepo with ten microservices in it; each one of those microservices might be written in a different language. I think it would be nice to see the collective effect of how each of those dependency manifests is being managed. So, the dependencies that are defined for Node: are they being managed better than the ones that are defined in .NET? Or are the dependencies for project A being managed better than the dependencies for project B? Being able to see, at a portfolio level, how well things are doing - that’s what I want to build up to.
So I want to have really rich support for lots of language communities. I want to have support for various dependency freshness measures instead of just libyear; I want to add some of the others. I want security vulnerability information to be in there so you can graph it too. And then I have plans to come up with some kind of composite score that boils all that stuff down into a five star rating for your project. And then I want to also be able to make recommendations and say, “hey, if you focus just on this library and the libraries that would get dragged along with it, and you upgraded it just to the next X version or whatever, then these metrics would be improved to this degree.” So maybe on that composite score it would take you from three stars to four stars, or from one star to three stars, or something like that. Although I plan on using lemons rather than stars; the Freshli icon has a lemon in it.
I think that would be really useful. So these are things that I plan on doing. Right now the ambitions are a little bit greater than the implementation, but as with many things that are built as labours of love, they take time.
Okay, so where do people go to learn about Freshli then?
Is it open source, is it closed source?
How does that work? Information is spread out a little bit. You can go to freshli.app and see more of the marketing information.
There is freshli.io, where you can plug in the URL for a Git repository, and it will analyze the project using the older infrastructure - not the latest infrastructure I built, but the older version of it. But you’ll still get a sense from the graphs that it generates.
You can also go to github.com/corgibytes/freshli, which is kind of a parent repository for all of the related projects. All of the components except for the website are open source at the moment. I am considering taking the CLI closed source as well. I’ve gotten feedback from folks that it’s the CLI that they’d actually be willing to pay for, and so, as a project that needs to be monetized in some form, I am considering taking the CLI private.
And then also, behind the scenes, Freshli is generating software bill of materials files - SBOMs, which have been getting a little bit of buzz recently. That’s essentially what the community-specific programs are doing: they’re generating the SBOMs, and then the Freshli CLI is using the data in the SBOM to figure everything else out and do the calculations. So the SBOM is kind of a common, consistent file format for dependency information, so that the core implementation doesn’t need to know about the details of those dependency ecosystems. That’s kind of the dividing line.
And I pivoted to that approach because the old infrastructure I wrote was all in C#. Everything in Freshli except for the Java-specific stuff is written in C#: the website is C#, the command line is C#. And so I just wrote code to parse the Ruby dependency manifest file format for bundler. I wrote code to parse the composer file format for PHP. I wrote code to parse the requirements.txt file format. I wrote code to communicate with the RubyGems website and API to figure out which versions were released when; same thing for PyPI, and the same thing for - I don’t know what the name of the repository for composer is called, it might just be composer, but wherever the PHP packages are kept. And I also had support for Perl, and that worked pretty well.
And so that’s where I started thinking. Then I started to look at the Java ecosystem, and specifically Maven, and the pom.xml file: how it can reference parent POM files, and those parent files can have an effect on the child. You can have any number of chains of parents - a parent can have a parent, which can have a parent, and so on - and then you can have sub-modules, so you can have subdirectories, each with its own pom.xml file. All of that gets brought in together. And version numbers can be defined in interpolated strings, and those interpolated string values can be defined in six different places. So I did not trust myself to be able to get that right. It was one of those things of, if I really want to get it right, then I need to use Maven itself. That’s what led to the pivot. It’s just kind of like, “okay, well, I’ll start with the hard one.” And so that’s why I started with Java and Maven.
Sure. So, quick side question, and I realize you haven’t got much time left for today, but that’s fine. Does this mean that you get a free SBOM out of the box if you use Freshli? Because obviously that’s a big thing.
You’ll get a bunch of them, actually, you’ll get historical SBOMs.
Because what Freshli is doing is mining the historical information in the source code repository. It walks all the way back to the date of the first commit. It finds the date of the first commit, and then it moves forward to the start of the next increment. You can tell it what increment you want to use: the default is one month, but you could set it to two weeks, three months, five days, ten days, whatever. It’ll go to the oldest commit, and then go forward in time from the oldest commit to that next starting point. The delta between the oldest commit and step one - if you think of the oldest commit as step zero - is almost always a fraction of that increment size. Then it walks forward in time at that increment, and it computes libyear as it would have looked on each of those days. So if you hopped in your time machine, went back to your project, and ran libyear on that day, then the value that we spit out is the value that you would have gotten.
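That repository walk can be sketched roughly as follows, assuming the default one-month increment is aligned to calendar month starts (the actual implementation is C#, and the alignment detail here is an assumption drawn from the “fraction of the increment” remark):

```python
from datetime import date

def month_start_after(d):
    """First day of the month following date d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)

def snapshot_dates(first_commit, today):
    """Yield the dates at which to compute metrics: the first commit
    itself ("step zero"), then the start of every month up to today,
    so the very first step is usually only a fraction of a month."""
    yield first_commit
    d = month_start_after(first_commit)
    while d <= today:
        yield d
        d = month_start_after(d)

# A repository whose first commit was mid-January 2022:
print(list(snapshot_dates(date(2022, 1, 15), date(2022, 3, 20))))
```

At each of those dates the tool would check out the tree as of that commit and compute libyear as it would have looked then.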
We have to be careful about a couple of nuanced details with the way different package managers work, especially with how version numbers get resolved. If you’ve got a range expression on your version numbers, then we have to make sure we only resolve to a version number that was published on or before that day. So if you’ve got your possible valid versions set up as something like 2.0+, then we need to make sure that we don’t use 7.0 just because that would match on the day that we’re running the command; it needs to be what you would have gotten if you had run the command back in time. And so we produce an SBOM at that point in time. You get an SBOM for every interval that we’re looking at, and those all end up getting collected together.
You end up getting those SBOMs for free. The SBOM is how I plan on adding support for security information, because there’s lots of really good tooling out there that will take an SBOM that you’ve generated and augment it with security vulnerability information. So that’s what I plan on doing. And again, for that, I want to make sure that it’s only augmenting with security vulnerabilities that were published at that point in time, so I need to do some filtering for that. But yeah, that’s the idea: you get the SBOM for each point in time, and then the metrics are computed using that SBOM.
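The time-travel version resolution described above might look like this in miniature: a range constraint only resolves against releases that already existed on the snapshot date. The version tuples and publish dates here are invented for the example:

```python
from datetime import date

def resolve_as_of(releases, minimum, as_of):
    """Resolve a 'minimum-or-newer' range to the highest version that
    had already been published on the as-of date."""
    candidates = [
        version
        for version, published in releases.items()
        if version >= minimum and published <= as_of
    ]
    return max(candidates) if candidates else None

# Hypothetical publish history for one package:
releases = {
    (2, 0): date(2018, 1, 1),
    (3, 0): date(2019, 1, 1),
    (7, 0): date(2022, 1, 1),
}

# Resolving "2.0+" as of mid-2019 must not pick 7.0:
print(resolve_as_of(releases, (2, 0), date(2019, 6, 1)))  # → (3, 0)
```

Without the `published <= as_of` filter, every historical snapshot would wrongly resolve to today’s newest release, which is exactly the mistake the transcript warns about.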
Excellent, excellent. So use Freshli, get your free SBOMs. Then you’re partially legally compliant, right?
Yeah, I like the idea of having them historically as well because I think that’s neat. The file format…
Literally showing how it changes over time, right.
The file format that we’re using is the CycloneDX SBOM file format, and the CycloneDX website has a tool catalog . It’s basically a directory of different tools and libraries that recognize or generate that file format. The OWASP CycloneDX team maintains lots of libraries that will generate an SBOM for different language ecosystems, and so the language agent for Java/Maven is using one of their tools - it’s the Maven plugin that generates a CycloneDX SBOM from Maven. I ended up fixing a small bug that they had. And that’s something else I plan on doing as part of this effort: as I notice that the CycloneDX SBOM libraries for the different language communities don’t have the support that I need, I’ll be submitting pull requests to make them better.
Aces. I like it. I really do. Cool.
So, real quick, Scott, where can people go to find out more about you and Legacy Code Rocks and stuff like that?
Yeah, legacycode.rocks is the best place to find out about Legacy Code Rocks. I don’t really have a good personal brand anywhere on the internet at this point. I recently launched a Mastodon server at toot.legacycode.rocks , so if somebody wants to follow along from a social ecosystem, I’ve got an account there that they can follow: mscottford at toot.legacycode.rocks . And then there’s corgibytes.com to follow what the company is up to.
Amazing. Well, thank you for being on the show again. Thanks, Scott.
Thanks for having me, I enjoyed it.
It’s been a wonderful time. And I am going to go ahead this evening and learn a little bit more about Freshli, throw a few of my apps at it, and then get really scared and run to you to ask how much it would cost to fix them.
Seeing those graphs is always like, “wow, really?” Because this is something that I didn’t really touch on, but most technical debt metrics don’t change if you don’t make a change - like complexity, or duplication, or code coverage. If code coverage was at 50% the last time you touched the project, then you know that a year later, with no commits, it’s still at 50%. The dependency freshness metric continues to grow while you’re not making commits. It continues to get worse, because the packages that you’re depending on continue to get updated. The only way it would be static is if every single library you were using suddenly became unsupported with no new versions, which is not likely.
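To make that drift concrete, here is a toy illustration: the project’s pinned version never changes, but upstream keeps releasing, so the metric climbs on its own (all dates invented):

```python
from datetime import date

def drift_libyears(used_release, latest_release_as_of):
    """Years between the release you're pinned to and the newest
    release that exists on a given observation date."""
    return (latest_release_as_of - used_release).days / 365.25

pinned = date(2020, 1, 1)  # release date of the version in use (made up)
# Same codebase, zero commits, observed a year apart:
print(round(drift_libyears(pinned, date(2021, 1, 1)), 1))  # → 1.0
print(round(drift_libyears(pinned, date(2022, 1, 1)), 1))  # → 2.0
```

Unlike code coverage or complexity, the number moves even when the repository does not.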
Right. That’s kind of scary.
Yeah. Because we tend to think, “oh, I haven’t changed it. It’s stable.” Right? But I think this is part of what makes these older projects more difficult to work on. Because let’s think of, like, a Ruby on Rails application, or an ASP .NET application from five years ago. I can’t pull the exact version up in my head - let’s say ASP .NET Core MVC 2.0 or whatever. Today, that would be painful to work with, but possible. I don’t know if that’s going to be true three years from now.
Or ASP .NET MVC targeting .NET Framework 4.8. That’s doable today; it’s still supported. I don’t know if that’s still going to be true three years from now. So even though it might be an older technology, and it might be okay to work with today, if you happen to stop looking at that code base for some reason - because you’re focused on something else for a while - and then, “oh, we have to make a change,” you come back to it and now it’s suddenly that much older, and it’s like, “oh, wow, this is really challenging to work with.” And I think a big part of it is, “okay, what version of Visual Studio do I need to download to even open this project file, from the pre-.NET Core days?” So that’s a challenge.
It is. Well, yeah, folks will have to go and check out Freshli and see how difficult the code is to work with.
And help out with .NET support. So if somebody wanted to start the language agent for .NET, based on the language agent for Java, that would be hugely welcome.
Cool. I mean, all pull requests are welcome, right?
Awesome. Well, like I said earlier, Scott, thanks for being on the show again and for using up some of your afternoon. I realize you’re a very busy chap, so yeah, thank you for that.
Thank you very much.
That was my second interview with M. Scott Ford about dependencies. Be sure to check out the show notes for a bunch of links to some of the stuff that we covered, and a full transcription of the interview. The show notes, as always, can be found at dotnetcore.show , and there will be a link directly to them in your podcatcher.
And don’t forget to spread the word: leave a rating or review on your podcatcher of choice - head over to dotnetcore.show/review for ways to do that - reach out via our contact page , and come back next time for more .NET goodness.
I will see you again real soon. See you later folks.
- The .NET Core Podcast Discord Server
- Part one of my discussion with M. Scott Ford
- Legacy Code Rocks
- Measuring Dependency Freshness in Software Systems
- FrequencyReducesDifficulty by Martin Fowler
- Software bill of materials
- Ruby Gems