The Modern .NET Show

S08E12 - Jody Donetti on Creating FusionCache and Collaborating with Microsoft on HybridCache

Sponsors

Support for this episode of The Modern .NET Show comes from the following sponsors. Please take a moment to learn more about their products and services:

Please also see the full sponsor message(s) in the episode transcription for more details of their products and services, and offers exclusive to listeners of The Modern .NET Show.

Thank you to the sponsors for supporting the show.


Supporting The Show

If this episode was interesting or useful to you, please consider supporting the show.

Episode Summary

Jody Donetti discussed his open-source project, FusionCache, a hybrid caching library designed to address the pain points he encountered over decades of building applications. He explained the project stemmed from a desire to share learnings and provide a robust, well-documented solution for improving performance and resilience. Inspired by a Stack Overflow blog post validating his caching ideas, Jody took the leap into open-source development, ultimately receiving recognition through awards and, more importantly, positive feedback from a growing community of users who’ve found value in his work.

The conversation delved into the core concepts of caching, differentiating between browser-side caching for static content and server-side caching for dynamic data. Jody articulated caching as a trade-off: utilising resources (like memory) to store data locally, thereby reducing slower access to the original data source (like a database). He highlighted the importance of considering both the benefits and drawbacks, especially regarding data staleness and the need for effective invalidation strategies. He then expanded on different types of caches—memory caching offering speed but being volatile, and distributed caching providing persistence and scalability, with hybrid approaches aiming to combine the best of both worlds.

A key focus was the evolution from simple memory caching to more complex multi-level caching architectures. Jody explained how a hybrid cache, such as FusionCache, operates with a fast in-memory layer and a slower, shared distributed layer (like Redis). This setup addresses problems like application restarts and horizontal scaling, where multiple instances need access to a consistent cache. He emphasised the importance of utilising the right caching methods, and how FusionCache handles ‘cache stampede’ - preventing a rush of requests hitting the database when cached data expires - through features like ‘GetOrCreate’ methods.

Jody stressed that FusionCache isn’t intended to replace existing caching solutions like Redis but rather to work alongside them. He highlighted the value of abstraction, noting that HybridCache (from Microsoft) provides a common interface that allows FusionCache and other libraries to integrate seamlessly. He also championed the open-source .NET community, mentioning other excellent caching libraries such as CacheTower, CacheManager and EasyCaching, and directing listeners to a comparison within the FusionCache documentation to help them choose the right tool for the job.

Finally, Jody underscored the importance of clear documentation and a user-friendly approach to learning, explaining he invests significant effort in making the FusionCache documentation accessible, even including hand-drawn illustrations. He encouraged developers to embrace open source, experiment with different solutions, and share their knowledge, reinforcing the collaborative nature of the .NET development ecosystem and the rewarding feeling of contributing something valuable to the community. He also mentioned a second course on Dometrain is coming soon, expanding on the getting-started course currently available.

Episode Transcription

The idea is that you have some sort of source. It's usually called the single source of truth, which is usually a database. In the case of web caching it's the remote server that is the authoritative source of truth.

- Jody Donetti

Hey everyone, and welcome back to The Modern .NET Show; the premier .NET podcast, focusing entirely on the knowledge, tools, and frameworks that all .NET developers should have in their toolbox. I’m your host Jamie Taylor, bringing you conversations with the brightest minds in the .NET ecosystem.

Today, we’re joined by Jody Donetti to talk about FusionCache, caching in general, and what in-memory, distributed, and hybrid caching are. Note: hybrid caching isn’t the same as the Microsoft library HybridCache.

That’s the first problem. The second problem is that by using a distributed cache directly, you pay the price of network calls and deserialization every single cache call that you make.

- Jody Donetti

Along the way, we talked about open source development, how Jody got started with working in the open, and that listeners should never be scared of working in the open. If you’re building something for fun or to learn (rather than to give back or create the next big open source library), then let people know in the readme.

So let’s sit back, open up a terminal, type in dotnet new podcast and we’ll dive into the core of Modern .NET.

Jamie : Jody, welcome to the show. It is an absolute pleasure to talk with you today. We’ve crossed paths a few times, I think, at things like MVP Summit.

Jody : Yeah.

Jamie : But I don’t think I’ve ever actually had the chance to stop and chat with you yet. So this is really my first proper interaction with you. Welcome to the show. It’s a real pleasure to have you.

Jody : Thank you very much for inviting me. Yeah, for everyone not knowing me, which is basically most of the people that will listen to this, I’m Jody Donetti. I’m a principal engineer and I’ve been doing coding, R&D, architecture, stuff like that for a good part of 30 years right now. So yeah, a bit.

One aspect that I specialised in along the years is caching. Some time ago, I released in 2020, actually at the very end of 2020, I released FusionCache, my very first open source project, which is a hybrid caching library with some pretty advanced features, mostly about resiliency and performance improvements. So yeah, that’s me.

Jamie : Cool. Yeah, we’re going to talk a little bit about caching and things like that, especially about FusionCache in a moment. But I also know, before I even say congratulations on this, you have a course out at the moment with Dometrain, I believe. I wonder, could you talk folks through that and why they should take the course, I guess?

Jody : Good question. Yeah, so some time ago Nick proposed this course to me. At first I was honestly like, "mmm, I don’t know, a course, really? Should people listen to me mumbling about caching stuff?" But then it clicked in a way and I decided to pour all of myself into it.

Basically this is the first course and there will be a second one out, hopefully soon. I’m working on it like day and night, but it’s a lot of work. It’s getting started and deep dive on caching in .NET. The idea is to cover from the ground up, from the very beginning, what caching is, what caching means, the pros and the cons, which is important to know, the various best practices. But more than that, all the building blocks, the little concepts that you can then use in your day-to-day projects, work or your open source project or whatever that is, to get it in a better place. Because with caching there are a lot of advantages that you can take out of it by using it. But you also need to know where not to use it and how not to use it. So I try to cover all of these in the course.

In the getting started course, I start from the very beginning. You don’t have to know anything except for, you know, a couple of things about C#, how it works. But apart from that, nothing. Even though it’s called getting started, I already go a bit advanced in it. But I try to do it both in the getting started and in the deep dive that I’m working on right now in a very approachable way, hopefully, if I’ve done my job well, so that anyone can jump into it and learn something new about caching.

I don’t cover, by the way, only FusionCache, which of course I do, but also caching in general and other caching libraries, like the new library HybridCache in .NET, which came out with .NET 9, and an overview of others in the open source community, which honestly is an amazing place, I think. The .NET open source community is full of rich projects that everybody should know, I think. Not just about caching, but anything.

So yeah, it’s out. The getting started course, the first one, has been out for a bit over a month right now. I have to say it’s going well. Some people completed it and I already received some reviews which were honestly good, so I’m happy about it. Some people even took some time to leave comments, explaining what they liked, what sometimes maybe not. Overall, I’m very happy about it. Yeah.

Jamie : Cool. Folks can get hold of that by going to the Dometrain website. I know I’m going to have a link in the show notes. You’re probably best, folks, to use the link in the show notes because that has a reference code so that we can track it back to Jody sharing it with folks. So use the link in the show notes, but I guess if people Google for getting started with caching in .NET, it’ll come up, right?

Jody : Yeah. Yeah.

Jamie : How long’s the course? Is it like a couple of hours or…

Jody : There are a lot of things I wanted to cover. So at first the idea was to do a course, like singular. Then I started putting out, you know, the structure of the course, which chapters and lessons and this and that, and it immediately became obvious that it was too much stuff that I wanted to cover. But I want this to be the one-stop place to get familiar with it and then get really proficient with even pretty advanced topics like distributed caching, multi-level caching, notifications and stuff like that. So I wanted to cover it all.

To do that, it became pretty long. So I had to split it up into two: getting started and deep dive. Now the getting started courses in general are about four, five, maybe six hours. Mine is seven hours and a half and fifteen minutes, something like that. The second one, I don’t know, because I’m still working on it as we talk. Yeah, I don’t know, probably a bit longer. But again, the idea is that if you take these courses from beginning to end, you’ve been able to see everything you need to know about caching in general, but in particular in .NET, because of course some concepts will be very general. You can use it even in other languages or whatever. But in particular in .NET, we’ll see libraries, patterns that you can use, things related to how the language works in relation to how do you use a cache library. So I would like to cover it all and that takes time.

Jamie : That makes sense, right? Like you said, there’s a lot to learn. But also that means that folks who pay for it are getting a lot of value for money, right? Because there’s a whole bunch of knowledge in there.

Jody : Yeah, that’s the whole idea. I wanted people to feel the value in what they bought, both if they bought the course as a single purchase, like only the course with lifetime access, which is something that can be done in the first seventy-two hours after the course launches: people can buy it and have it forever.

Then there’s the all-you-can-eat buffet, like the Netflix style. You pay a monthly subscription or three-month subscription, something like that, and then you can watch any course on the platform, which is actually pretty great for the value that you get back. But anyway, yeah, that’s the point.

The way I try to structure it is to be, again, approachable. I don’t want people to have a problem getting to understand a new concept or a technique or a specific option, a feature, something like that. So I want this to be very, very approachable, and something people can come back to later when they need it. At least, that’s what happened to me sometimes with some videos online. I watched the video the first time to basically get introduced to a concept. Then time went by and I had not used it. Then I started needing it, and so I got back to it later to make myself clearer on the subject, you know. So that’s the whole thing in my opinion: having a place where you can go back to even later and re-watch some lessons, some chapters, to make it click, if you will.

Jamie : Yeah, that’s pretty cool. Because one of the things I don’t really like about book learning is that I feel like sometimes I’m locked to specifically what’s on the page. Now obviously in a video course you’re locked to what’s on the video course, but I feel like it’s much easier in a video format to have those links between previous lessons and next lessons. Hey, we’re going to talk about this soon, but when you get to that lesson, come back here and remind yourself of it.

So I’m currently reading through Richter’s CLR via C#, just to get a better understanding of how the CLR works and intermediate language. I would say I’m about an eighth of the way through. It’s an eight hundred page book and I’m a hundred and something pages in. I’m like, this doesn’t make sense yet, but I’m hoping it will soon. I feel like that tends to happen more with book learning than it does with videos, right?

Usually I find that with videos, maybe because the video creator can almost see all of the content, you know, that you can jump backwards and forwards and go, oh, cool, we covered that in that first section, so I can tell the person to go back there. I can maybe just include it again, right? It’s a little easier, I think, to edit video content than it is to edit a book, because otherwise loads of people would be book editors. It’s way easier, I suppose, to see that content when you are editing through a video course. Then you can provide the supplemental content like here’s a cheat sheet or whatever, right? You can’t easily do that in a book.

Jody : Yeah, but you touched on a very important part, which is taking away a lot of time actually, at least for me. I’m a newbie related to educational material creation. But I’m taking this, hopefully, seriously, meaning I really try to play it and see how it will feel from the eyes of the people watching it in the future.

What I mean by that is it’s not just, you know, taking everything you know and dropping it on a screen with some videos, because that is like all the information will be there, but they will not be digestible. They will not be understandable, okay? So a lot of the time I spent creating the first course and now the second one is not just writing down the things that I want to talk about, but also restructuring it, restructuring the information in a way that is, again, digestible.

For example, I might spend a lot of time writing about resiliency. But then I think, hmm, wait a second: I’m mentioning this and this and that, and I’ve not yet covered those things. So it’s better to cover them before, so that it makes more sense as you watch it through, as you go through it. But then those create different connections. So it’s a lot of work in reorganising the information and creating a kind of storytelling, in a way that makes sense.

The more you go on, the more you have something new that you’ve learned. So it keeps being interesting and, let’s say, nice to watch, but also you get to learn something new that the next day you can go to the office or whatever and use it right away. That’s something I strive for. I don’t know if I’ve been able to do it. Let’s hope.

Jamie : Well, I have every confidence in you, Jody. Having seen some of your talks like the NDC one that you did recently, I have every confidence that you have put something together that fits along a curriculum style lesson plan. So yeah, I absolutely fully appreciate that you will have done that. So I’ve full confidence.

Jody : Yeah, in general I can say that if you already took a look at the FusionCache documentation, which the community seems to particularly enjoy, that’s like an expanded version of that, meaning I will go broader and deeper into every possible aspect of caching from a 360-degree point of view with the same style. So hopefully, again, approachable, you know, with illustrations and drawings. It takes a lot of time to do that too. But yeah, that’s the way I do it. So yeah.

Jamie : Cool. Okay. Well, folks know they’ve got to go check this out. Yeah, they should definitely check it out because otherwise they’re not going to learn about caching. But we’re going to cover a little bit of it today.

So I guess if we’re going to talk about caching, I know that there are lots of different ways to cache something and lots of different ways that you might think of caching something. For the web devs out there who’ve done a lot of front-end work, my vision of how caching works is my web page loads and my browser says, hey, server, can I have this static content, CSS, JavaScript, image, something like that? Then the browser will hold on to a copy of that file, either as long as the server tells it nothing has changed, or maybe I can somehow detect, if I’m the browser, if nothing has changed, I’ll just serve that version of it.

This comes from back in the day when we were on 56k modems or whatever. It took a lot of time to download images and CSS and JavaScript, right? So if we can keep the file, if it’s not going to change, we may as well keep it and just reuse it. Then the page will load faster, in bunny quotes. It doesn’t really load faster, but it feels like it loads faster, right? So that’s one version of caching that I know we can fall back on as an example of what caching is. But you’re talking specifically about server-side caching, right? Which is different to front-end caching.

Jody : Yeah, it’s different but conceptually caching is what you just described. Meaning it’s a way to use a little bit more resources in terms of, for example, memory, which in the browser is either disk space because it’s then stored on disk or in memory, in exchange for CPU time. So basically it’s faster, to say it bluntly. So that’s the whole thing.

Then the real complex part is, okay, how to be sure that when something changes the cache is notified of that? So there are different techniques, different pros and cons in each approach. You have a balance to find. I will say that for most things in life, the key is always to find the right balance. Not go to extreme in one side or the other. So this is the same. The same is true for caching.

For example, how long should you cache something if you don’t know when it will change in the future, because nobody can predict the future? This is one of the main topics of caching and I go through it in the course. But apart from that, the idea is that you have some source. It’s usually called the single source of truth, which is usually a database. In the case of web caching, it’s the remote server. That is the authoritative source of truth and you get the information from there. Then instead of going to the source every single time, which takes time and resources and can also be brittle in case of transient issues, what you do is use your own local copy. This is, very briefly, caching in a nutshell.
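The flow Jody describes here, check a local copy first, fall back to the source of truth, and keep the copy for a while, can be sketched in a few lines of C#. This is a deliberately naive illustration, not FusionCache's implementation; the TTL and the loader delegate are made up for the example:

```csharp
using System;
using System.Collections.Concurrent;

// A minimal cache-aside sketch: check the local copy, fall back to the
// single source of truth (e.g. a database), and keep the copy for a while.
public sealed class NaiveCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, DateTimeOffset ExpiresAt)> _entries = new();
    private readonly TimeSpan _ttl;

    public NaiveCache(TimeSpan ttl) => _ttl = ttl;

    public TValue GetOrCreate(TKey key, Func<TKey, TValue> loadFromSource)
    {
        // Fast path: a fresh local copy means no trip to the source.
        if (_entries.TryGetValue(key, out var entry) && entry.ExpiresAt > DateTimeOffset.UtcNow)
            return entry.Value;

        // Slow path: go to the source of truth, then keep a local copy.
        var value = loadFromSource(key);
        _entries[key] = (value, DateTimeOffset.UtcNow + _ttl);
        return value;
    }
}
```

A real library adds the things this sketch ignores, expiration policies, memory pressure, and (as discussed later in the episode) protection against many callers hitting the source at once when an entry expires.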

The thing I did with FusionCache is try to avoid people going through the pain that I’ve been through along the years, decades now actually, with an easy-to-use solution where I tried to put the learnings that I got through the years into well-documented, well-modelled and, hopefully, uniform features and options and stuff like that. Basically I can go through any one of the features or options that I created in FusionCache, and I can tell you why, where that came from. Oh, that one time when, you know, we had issues in production, and it was like what you can think of Black Friday or whatever.

We had to squeeze, I don’t know, one month of sales into four hours or whatever, and there was an issue with caching. How did we do? How did we survive that? Well, it was messy, it was bad, it was a lot of effort and, you know, stressful. Then I came up with, mmm, what if I can do something to make something like this easier for the next guy, which can very well be me the next time, but also somebody else? That is how basically FusionCache was born and what it is trying to do.

Jamie : So that’s, I don’t know if it translates well, but obviously British and American English, we say that necessity is the mother of invention, right? You’ve had, it sounds like, a lot of time in your career where you’ve maybe thought, gee, I wish this would be easier to do with a cache. Then you’ve gone, wait, why don’t I just make that happen, right? Be the change I want to see, and then give it away for people to use, right?

Jody : Yeah, exactly. Exactly. I have to say shout out to what was back then the Stack Overflow Dynamic Duo, meaning Nick Craver and Marc Gravell. They were colleagues back in the time at Stack Overflow.

One of them, I think it was Nick, posted a blog post titled something like "How We Do App Caching - 2019 Edition". Up until then, I never had a single open source project. I never created one. I contributed a bit here and there, you know: patches, small bug fixes, stuff like that. But I never felt like, mmm, I was in the right place to, like, say my thing in a way. But I wanted to contribute back more.

When they posted that blog post, it was for me like a validation, if you will, of some of the ideas around caching that I had for a long time. I already used those ideas, those patterns, those best practices, those techniques. But one thing is you use it for yourself, they work and you’re fine, and that’s already pretty good. But another thing is, you know, trying to put it out for others to use. You don’t want maybe to distribute the wrong idea, to do, you know, maybe it’s your own thing that works in your own way but not in other situations. You don’t want people to maybe be poisoned by that if that is not actually the right call.

So in that moment I remember reading their blog post and I was like, oh, so that was not such a stupid idea after all. So that felt like a validation in a way. I was like, you know what? Okay, let’s just put my face on it, in a way, because in reality I’d rather hide my face away, but anyway, let’s pull the trigger, as it were, so to speak, and, you know, come out with my very first open source project. That’s how it was born. Then over time, people asked me about new features, and others I came up with from, you know, real-life experiences or something like that. But yeah, that is how it was born and how it has gone along over the last five years, well, almost five years already. Yeah.

Jamie : To me, I always enjoy listening to people’s origin stories, both in tech but also in open source, right? Because in both of those instances there is something that I can learn as an open source maintainer. Admittedly of something rather small, but as an open source maintainer I can think, wait, I never thought to actually sit there and think, just because I use it, does it need to be open source?

Those weren’t your words, but like, am I distributing the wrong answer to a problem, right? Because that just ends up with developer confusion, right? Because if I say, here’s this package, and I’m going to talk specifically about me here. I’m not saying you’ve done this, Jody, but if I’m distributing an open source package of some code and say, this solves my problem, and leave the sentence hanging there, then someone who reads that might think, well, if it solves Jamie’s problem, it obviously will solve my problem too, right?

So then that person will then attempt to use that or may attempt to use that open source project just because it’s open source and just because someone has said this has solved their problem. But that might not actually solve the actual problem they’re seeing. I like that. I hadn’t really thought about that. Does this solve the actual problem? Is this a good solution that everyone can use? That is something I’ve never really thought about. So I really appreciate you bringing that up.

Jody : Yeah, even though I don’t want to maybe give the wrong impression, meaning for people listening, don’t take this as a blocker. If you are thinking about releasing some open source project, do it. I mean, if it has a minimum of, you know, reasonable shape or it makes some sense, publish it. Maybe put a word like, hey, I use this in production or on a project or whatever. It worked for me, you know, no strings attached, but be careful. Maybe I’ll have some documentation later to explain how it works. But it’s not wrong to do. Don’t feel like your code does not have value or something like that.

Because honestly, I released it thinking, hmm, probably in two years’ time, nobody will have, you know, downloaded a copy. Maybe I will have, I don’t know, a hundred downloads or something like that. Maybe it’s stupid. There are already other amazing, honestly, caching libraries out there. So in a way I felt like who am I to get another one? Why should people get this one instead of another?

But of course it will not go this way every single time. But to me it has been an awesome experience, not just releasing it and finally deciding to go out with it, but also what came after. Not just in terms of, I mean, I’ve been lucky enough to get both a Google Open Source Award in the first six months and, the next year, a Microsoft MVP Award. So that was great and I’m really thankful for that. But put that aside. The community that has built up around it over time, like people commenting, hey, I’ve used that in my project and it solved an issue. That to me is the best thing, because when I know that somebody used my thing and solved an issue that they had, that’s like, okay, I helped someone. It was worth it. That’s it. So I would suggest people try that, because it’s really a good feeling.

Jamie : Yeah, no, I agree. I know that that’s me backpedalling against what I was saying, or it may sound like I’m backpedalling and taking my point back. But it’s not; it’s two sides of the same coin. You know, folks should always feel free to put something out there as open source. As you say, right, put a disclaimer that says, this is me just trying something out. If you want to use it in production, here be dragons, right? You’re on your own. I make no guarantee that this will actually work. But yeah, putting stuff out there and just trying, and almost like learning in the open, is super important. It’s how I have boosted a whole bunch of my skills, just by going, huh, I wonder how this works. I could learn privately, but if I throw it up on a stream, on a live stream, or I throw it into an open source library, and then ask some friends who I really trust, like, does this make sense? Am I doing this right? How else are you going to get that code in front of them, right?

Jody : Yeah, absolutely, absolutely. The one thing I will say maybe is just be humble, because as cool as you may think you are, and you may be, not you Jamie, of course, in general people listening, there’s somebody out there who is definitely better than you at some parts of what you did. They may help and you can get better. Then in turn, you may help somebody else later. It’s a circle; it comes full circle.

For example, about this full-circle thing: I felt inspired by, validated by, the blog post by Nick Craver and Marc Gravell at Stack Overflow. Jump forward in time to 2024: they were both working at Microsoft, and Gravell started working on HybridCache, which is the multi-level cache thing from Microsoft that came out with .NET 9.

We then had a discussion, well, multiple discussions, where we basically shared some ideas, or at least I tried to suggest some ideas, some, you know, things that we’ve been through with caching in general and with FusionCache as a project. So things like the shape of some features, or maybe it’s better to introduce this now than later, because otherwise it will be a breaking change. You know, small things like that.

The cool thing, and again, shout out in particular to Marc Gravell, who was the team lead on HybridCache, is that they listened. The cool thing that came out was that they not only created their own multi-level cache thing, but also came up with an abstraction. Meaning HybridCache from Microsoft is not just their own implementation; there are really two pieces, and the abstraction is a separate package. It’s like IDistributedCache, which is the shared abstraction for distributed caches, but in this case the HybridCache abstract class is the shared abstraction for multi-level caches.

That means that people like me in the community can create our own third-party implementation, which is what I did actually. So on top of, you know, having this collaboration with Marc and the team, you know, sharing of ideas and stuff like that. In the end, I created a small adapter for FusionCache to be used as HybridCache, the Microsoft abstraction, which is a nice moment of, you know, going full circle, if you will.
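The adapter Jody mentions means app code can depend on Microsoft's abstraction while FusionCache does the work underneath. A rough sketch of what that wiring looks like, based on my reading of the FusionCache v2 API (the `AsHybridCache()` call and the `ProductService` consumer are illustrative; check the FusionCache documentation for exact package and method names):

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Hybrid;
using Microsoft.Extensions.DependencyInjection;
using ZiggyCreatures.Caching.Fusion;

var services = new ServiceCollection();

// Register FusionCache, then expose it as the shared HybridCache abstraction.
services
    .AddFusionCache()
    .AsHybridCache();

// Consumers depend only on the abstraction, not on FusionCache directly,
// so swapping implementations later doesn't touch this code.
public class ProductService(HybridCache cache)
{
    public async Task<string> GetNameAsync(string id) =>
        await cache.GetOrCreateAsync(
            $"product:{id}",
            async ct => await LoadNameFromDatabaseAsync(id, ct));

    // Hypothetical stand-in for a real database call.
    private static Task<string> LoadNameFromDatabaseAsync(string id, CancellationToken ct) =>
        Task.FromResult($"Product {id}");
}
```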

Jamie : Yeah, that’s really cool. So they inspired you to put out FusionCache originally and then Marc Gravell worked on HybridCache, which inspired you. Let’s just say it, right? I’m going to say it directly. He was inspired by your work.

Jody : I don’t know. I don’t know about that. Honestly, there are some common points, yes, but I don’t know if I inspired him or not. For sure he has been really, really nice in giving me some of his time to talk about that. I thought that could have been a good move for the .NET open source community when they announced it, if done right. I think that they did the main parts right, meaning, for example, again, the fact that it’s an abstraction and not just the only thing that people cannot, you know, do anything about it. That was a really, really good move.

I think probably that came from Marc’s past as an open source maintainer. Because just for people that don’t know him, and I don’t know if anybody in the .NET space doesn’t know him: protobuf-net, Dapper, StackExchange.Redis. I mean, in the open source .NET space he is the guy. So yeah, the fact that he spent some of his time chatting about this new cache thing and how it could play out alongside FusionCache instead of against it, that was really, really cool and I appreciate it a lot.

Jamie : Then, like you said, full circle, right? So then you were able to contribute back towards FusionCache to allow it to be used alongside. That’s, yeah, I love open source innovation. Admittedly, if it weren’t for closed source, most of us would not get paid. But I love open source innovation, being able to go, hey, that’s really cool. Let me see if I can make something like that or make my thing work with your thing, because your thing is open and I can see how it works and my thing is open and I can see how it works. So let’s get this to work together, right? That collaboration. It’s amazing.

Jody : Yeah, and also, to touch more upon that: you talked about my thing working with your thing. In the case of a multi-level cache, like FusionCache and HybridCache are, there’s the so-called second level, which is the distributed level. It’s not just one thing. It can be Redis, it can be Memcached, it can be Garnet, it can be this and that.

To talk with the distributed cache, the distributed level, the second level, you need to serialise stuff, so you need to talk with a serialiser. So again, you need to be able to talk with Protobuf, with JSON or XML or, what was it, MessagePack, MemoryPack. There’s a lot of different projects that need to be able to communicate with each other, or at least that FusionCache, in my case, needed to be able to communicate with.

In the case of FusionCache, you can pick the serialisation format and the serialiser that you want. So you can plug in a distributed cache based on Redis, for example, as the second level, and the serialisation format with the Protobuf serialiser, for example, and all these pieces need to be able to, you know, communicate in some way and find the right balance to make everything work out. That is really great.

Jamie : It really is.

Jody : It really is.

Jamie : We have stepped a little bit away from how caching works, FusionCache. I love this conversation. But I’m just thinking that people who are listening in who don’t know much about server-side caching have heard you say multiple things like hybrid caching, multi-level caching, level two caching, things like that. I’m just worried that someone’s going to be listening going, they’re talking about stuff and they haven’t explained it.

Jody : Yeah, absolutely. So, if you will, I can give the two-minute intro to that. Basically, caching on the server side means the same concept of caching that we talked about before, like in the browser, but for data on the server side.

So imagine your C# web server, website or API or whatever. Right now, without caching, you go and talk to a database to grab the data, and that’s the point where you can introduce a cache. What happens is you go to the cache first. Basically you ask the cache: hey, do you have this data that I would otherwise have to go to the database for? If the cache says no, your code can go and grab it from the database and then store it in the cache for subsequent faster access. This is the very short version. Now, this is caching in general.

Then there’s the issue of: okay, what type of cache should I use? Memory? Distributed? I’ve heard those terms, but I don’t know exactly what they mean. Long story short, a memory cache, you can think of it like a dictionary in .NET, for example, because basically it’s a piece of memory where you put data, and each piece of data is identified by a key, the so-called cache key, which again is like a dictionary: you have key-value pairs and that’s it.

This is cool because it’s super fast. When you try to access a piece of data in memory, it’s a memory access away. So there’s no network call, there’s nothing. So it’s super, super fast. And there’s no dependencies, meaning you don’t need to spin up a database, Redis or whatever. It’s just a dictionary in memory.
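The dictionary-with-a-key idea Jody describes could be sketched like this; it's a hypothetical toy, not a real cache, with no eviction policy, size limits, or stampede protection:

```csharp
using System;
using System.Collections.Concurrent;

// A toy in-memory cache: a thread-safe dictionary of key -> (value, expiry).
public class ToyMemoryCache
{
    private readonly ConcurrentDictionary<string, (object Value, DateTimeOffset Expiry)> _entries = new();

    public void Set(string key, object value, TimeSpan ttl) =>
        _entries[key] = (value, DateTimeOffset.UtcNow + ttl);

    public bool TryGet(string key, out object? value)
    {
        if (_entries.TryGetValue(key, out var entry) && entry.Expiry > DateTimeOffset.UtcNow)
        {
            value = entry.Value; // pure memory access: no network call, no deserialisation
            return true;
        }
        _entries.TryRemove(key, out _); // missing or expired: drop it
        value = null;
        return false;
    }
}
```

Real implementations like .NET's MemoryCache build eviction, expiration scanning, and size tracking on top of essentially this shape.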

This is cool until your application restarts for various reasons. Maybe it crashes, maybe you deploy the new version or whatever. When the application restarts, we have a problem called cold start, meaning you start and the cache is empty because, again, it was like a dictionary in memory. This is the first issue.

The second issue is what’s called horizontal scalability. Okay, you can deploy your website or API or whatever on one node, but what happens when your project is, hopefully, successful and you need to deploy it on multiple instances? That can be multiple instances on one node, a single instance on each of multiple nodes, multiple pods. Today we are in a containerised world, so Docker, pods, whatever. In those cases, each application will have its own memory cache, basically its own in-memory dictionary. So each of them will need to go to the database to grab the data and then save it locally, and that can create issues.

So the solution to that usually is: oh okay, I will now switch to what is called a distributed cache. A typical example is Redis. Redis is a remote service, so by storing the data remotely, the cached data survives application restarts, because the data is out of process. It’s remote. Not only that, but data can be shared between different nodes because, again, you can think of a distributed cache like a database. It is a database. This is, by the way, the way in which I very quickly explain things in the course.

You can think of a distributed cache as a database. The next question is, okay, so I have data in my database and now I need to put the data in another database. What’s the point? The point is that by nature, by design, distributed caches are simpler in features and in what they guarantee, mostly around ACID compliance or, for example, around persistence.

Some of them, like Redis by default, are remote but store data in memory. It’s super fast, but if you restart it, all data is lost. If you use it as a database, that can be an issue, and you can cover that by enabling the persistence part; otherwise, it’s just in memory. But if you use it as a distributed cache, you don’t care, because by nature a cache is temporary.

So by using a distributed cache, what do you get? You get to survive cold starts, so when the application restarts, and you get horizontal scale, because multiple nodes can talk to the distributed cache and grab the data from there, instead of each of the nodes having to go through the database.
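In .NET the standard abstraction for this kind of cache is `IDistributedCache`, and wiring up Redis might look roughly like this sketch (the connection string, key format, and the `Product` type are placeholders for illustration):

```csharp
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

// Registration, e.g. in Program.cs
// (package: Microsoft.Extensions.Caching.StackExchangeRedis)
builder.Services.AddStackExchangeRedisCache(options =>
    options.Configuration = "localhost:6379"); // placeholder connection string

public record Product(int Id, string Name); // hypothetical model

public class ProductReader(IDistributedCache cache)
{
    public async Task<Product?> GetCachedAsync(int id)
    {
        // Everything is byte[]-based: a network call plus deserialisation
        // on every cache hit, in exchange for shared, out-of-process data.
        var bytes = await cache.GetAsync($"product:{id}");
        return bytes is null ? null : JsonSerializer.Deserialize<Product>(bytes);
    }
}
```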

To give a sense of scale: the other day, a guy in the community explained his setup to me, and he has something like 20 or 25 different applications deployed in Kubernetes pods. In total he has something like 400 pods. Now, if you have 400 pods, you have 400 memory caches, so 400 pods need to go to the database, the database load will increase, and then you pay more in your cloud bill or whatever. That’s the whole idea of going distributed.

The problem now is, and I’ll close this off, the problem now is that first you started with the memory cache, so you wrote your code against that API. Now you have to switch to a distributed cache, so you need to go around and change all your code to work with a distributed cache instead of a memory cache. That’s the first problem.

The second problem is that by using a distributed cache directly, you pay the price of network calls and deserialisation every single cache call that you make, because the data, since it’s remote, is basically a byte array, it’s binary. So you need to grab it from there by doing a network call and then get back a binary and then you deserialise it. So it is more costly than a memory cache. Still less costly than a database. Again, it’s all a game of balance, but more costly than memory caching.

So here is where a hybrid cache, or multi-level cache, comes into the picture. The idea is you code against one unified abstraction that can be either one level or multiple levels. In the case of hybrid caches, there are typically two: L1 in memory and L2 distributed. The hybrid cache takes care of the internal communication between the two levels, so between the memory cache and the distributed cache, which is the second level, such that you get the best of both worlds, memory and distributed.

How? The first access is always in the memory level. So if the data is cached there, it’s super fast; you do not allocate a single byte. If it’s not there, the hybrid cache will say, hmm, okay, let me check the distributed level, which is shared between different nodes, so maybe another node already populated it with the data that I need. If it’s there, it will take the data from there, bring it down, deserialise it, and put it into the memory cache, so that the next access will be from memory, super fast. It will do all of this for you.
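That read path, memory first, then distributed, then promote back into memory, might be sketched like this (the `_memory` and `_distributed` fields and the `Deserialize` helper are hypothetical; real hybrid caches add locking, stampede protection, and failure handling):

```csharp
public async Task<T?> TryGetAsync<T>(string key)
{
    if (_memory.TryGetValue(key, out T? value))   // L1: pure memory access
        return value;

    var bytes = await _distributed.GetAsync(key); // L2: network call, shared by all nodes
    if (bytes is not null)
    {
        var fromL2 = Deserialize<T>(bytes);       // pay deserialisation once...
        _memory.Set(key, fromL2);                 // ...then promote to L1, so the
        return fromL2;                            // next access is memory-fast
    }

    return default; // miss on both levels: the caller falls back to the database
}
```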

Not just that, but because you can enable and disable the distributed level, the second level, whenever you want, the cool thing is that you can start immediately using a hybrid cache configured with just the memory level, basically as if it’s a memory cache, and only later, when you need to scale out, you can enable the distributed level with one line of code during setup. All of your code that is using the hybrid cache will not need to change.
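With FusionCache, that setup-time upgrade could look roughly like this; the builder method names are recalled from the FusionCache docs and the Redis connection string is a placeholder, so treat it as a sketch rather than authoritative setup code:

```csharp
using Microsoft.Extensions.Caching.StackExchangeRedis;
using ZiggyCreatures.Caching.Fusion;
using ZiggyCreatures.Caching.Fusion.Serialization.SystemTextJson;

// Day one: memory-only. Calling code just uses IFusionCache.
services.AddFusionCache();

// Later, to scale out: add a serialiser and a distributed cache at setup
// time. Code that consumes IFusionCache does not change at all.
services.AddFusionCache()
    .WithSerializer(new FusionCacheSystemTextJsonSerializer())
    .WithDistributedCache(new RedisCache(new RedisCacheOptions
    {
        Configuration = "localhost:6379" // placeholder
    }));
```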

This, in not two minutes but maybe four minutes, is the short version of why it makes sense to go hybrid and use something like FusionCache, HybridCache, or something like that.

Jamie : Okay, so let’s see if I understand then. We’ve had on the show in the past Jerry Nixon, who talked about Data API builder, which we know already uses FusionCache, which is awesome. But he had said that it doesn’t matter how you build your app, the slowest bit is always going to be the database, right?

Talking to the database, you have to go over the wire, right? Whether that is to a remote server or a server on your machine, the data still has to travel out of your app and back, right? So that’s always going to be slow. Whether it’d be slower to read from disk, I’m not sure. I don’t think I’ve done the maths there. I think it is still slower to read from disk, but it doesn’t matter.

So you query your database, that takes a while to get that data back. So in order to make that more efficient, if I can cache those values somewhere closer to my app, maybe in memory, maybe as this distributed cache, that means I’m not having to hit the database as much, which means I can get the information faster, which is awesome. I fully understand that.

Then there is a hybrid model where I can have my cached data stored in memory, which is great. That’s super fast. But then if my app dies or I need to push a new version or maybe the pod that it’s running on falls over or whatever, I lose that data. So I want to maybe push some of that data out to the distributed cache, which is where the hybrid idea comes from. I can say to the hybrid cache, FusionCache’s hybrid caching, I need this to be stored somewhere, I need it to be accessible, but also the app may restart, so you figure it out, right? Then that way I’m not having to make that trip all the way back to the database to come back.

Like I said, maybe the database is on your machine, but it’s still a long distance to travel, right? We’re still moving electrons, right? So it still has to physically travel somewhere and come back. Then you were saying about how the different levels of caching work: level one is our local in-memory cache, level two is our distributed cache.

So my question, I guess, distributed cache, is it called a distributed cache just because it’s not on my machine or is it a distributed cache because I can then maybe create copies of that? Like I can, you know, I can horizontally scale my app. Can I horizontally scale my cache? Is that something that happens?

Jody : Oh, okay, that’s a nice question. So in general you have to think about a distributed cache as one thing, just like you can think of a database as one thing, even though the specific deployment of the specific database that you have may be master-replica or primary-secondary or whatever. That’s an implementation detail of the infrastructure that you picked, maybe chosen by your infra guy, whatever.

In the case of a distributed cache, the idea is that each instance of your application running on multiple servers, multiple pods, whatever, will have its own memory cache, which is, again, like a dictionary in memory. Whereas the distributed cache is one for everybody. Again, it’s like a database. So everybody connects to the same distributed cache.

If then this, let’s say it’s Redis, because it’s one of the most common, if this Redis is one server, or it’s five servers in a cluster or whatever, that’s a different problem that is solved in a different way. Logically speaking, it’s one; that’s the key part. So if you have a thousand nodes or a thousand pods, all thousand pods will connect to the same Redis instance as a second level. But what I call the dance between the first level and the second level is all taken care of for you by the hybrid cache, by FusionCache in this case, or another.

Jamie : Okay, so I have two questions. I know these two questions are coming up. Just because they are likely questions you get all the time.

Jody : Okay, let me see if I know the first one.

Jamie : So we’ve done a lot of conversations so far about, you’ve used Redis as an example because everybody knows it. So first question, super common question, I guess, is, so is FusionCache just like Redis? If I know Redis, do I know FusionCache?

Jody : Yeah, I knew that would come up. So it’s not a matter of one versus the other. There’s no such thing, because Redis is used most frequently, let’s say, as a distributed cache, and FusionCache is not a replacement for that. You can use whatever distributed cache you want with FusionCache. If you have Redis, you can just use Redis as the second level, the distributed level. Any distributed cache can be used as a second level with FusionCache.

Oh, you’re using Memcached instead of Redis? Same thing. You can use Memcached as the second level, the distributed level. So it’s not one versus the other.

Of course you can use Redis directly, that’s a given, and not use FusionCache. But this is the scenario that I talked about before: if you use the distributed cache directly, you will always go there. As you mentioned, probably quoting Jerry Nixon, a distributed cache, just like a database, is remote, and that will always be slower than accessing something in memory, which is the first level, the memory level. So this is the thing: you can use Redis as the second level.

Jamie : Okay, so that was my first question. Is it just Redis? No. It can be used with Redis, it can be used with whatever you want. So the second question is, okay, so let’s say I have, and again, this feels like it’s a question you’ll get a lot, I have a thousand pods. You’ve just said that, right? We’ve got a thousand pods. We’ve got a thousand replicated versions of my app. All 1,000 go down at the same time. All 1,000 come back up at the same time, and they need to refill that in-memory cache.

So all 1,000 of those apps, hopefully you can see where this is going, all go to the database because we’ve got a thousand users, for instance. Each user has hit one of those individual pods and sent the same request. I want the same thing. I’m sending to the same API endpoint. I’m going to receive the same data. Now all thousand of those users will hit one or more of those pods which will have no in-memory cache. So then all 1,000 of those requests for the same thing will go to the database and come back. So what happens there? Is that literally a thousand times, like I need to go to the database a thousand times to get that one thing to then eventually cache it afterwards once everyone has it? How does that work?

Jody : Yeah, yeah. I thought you would come to that. So, in general, first, the short answer: no. If you’re using a hybrid cache like FusionCache, for example, and it’s configured with two levels, so a memory and a distributed one, in the situation you just described all of those thousand pods will need to go to the second level, not to the database. So in our case, for example, a Redis instance, which is already way better in terms of performance and database load than going to the database. So they will go to the Redis instance.

But the real question, probably the real thing you’re talking about, is cache stampede. Did I get it right?

Jamie : Yeah. Yeah. Yeah.

Jody : Okay. So the very quick way to think about it without anything visual to see is this. Leave aside thousands of pods or whatever. You have just one good old web server, pretty old, like 20 years ago, whatever. Now a thousand requests come in to the web server for the same URL, for the same piece of data at the same time. The data that they need to fulfil the requests is not yet in the cache.

Without any special care, what will happen is that all of those thousand requests will check the cache, see that nothing is there, call the database, do the query, so a thousand queries to the database at the same time for the same data, then get back the result, save it into the cache, and then return the data. Now, of course, the next requests will be served by the cache. Cool. But that first time, it’s called a cache miss: you checked the cache and the data is not there. That will trigger a thousand different database queries. This is bad. This is known as cache stampede.

Now, how can you solve it? It’s very easy, but it depends on the library you’re using. Let me put it this way: if you do a get call to get the data from the cache, then check if it’s null or not, usually it’s an if-null check or something like that, and in that case you go to the database and then you separately do another call to save the data in the cache, there’s nothing that can be done, meaning the cache cannot help you in preventing this cache stampede issue.

If, on the other hand, you call a method that is usually called, depending on the library, GetOrCreate, GetOrSet, GetOrAdd, something like that; again, naming is hard and every library uses different names, but that’s basically it. You provide this method with the cache key that you want and a factory. It’s called a factory, but basically it’s a piece of code, a lambda in .NET, which is the logic that can go to the database, or whatever the remote source is, and grab the data. You give that to the cache and you say: do your thing. In that case, the cache can help you. If the cache supports cache stampede protection, which FusionCache, for example, and even HybridCache by Microsoft both do, it will coordinate the different calls to the cache and to the database such that only one query to the database will be executed. This is called cache stampede protection, or prevention, or something like that. Some caching libraries provide this feature and some don’t.
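The two patterns side by side might look like this (the key names, the `LoadFromDatabaseAsync` loader, the `Product` type, and the exact FusionCache overload are illustrative; check the library docs for current signatures):

```csharp
// Naive get/check/set: the cache never sees the loading logic, so 1,000
// concurrent misses mean 1,000 database queries (a cache stampede).
var product = memoryCache.Get<Product>("product:42");
if (product is null)
{
    product = await LoadFromDatabaseAsync(42); // hypothetical loader
    memoryCache.Set("product:42", product, TimeSpan.FromMinutes(5));
}

// Factory pattern: hand the cache the key AND the loading logic. A cache
// with stampede protection (FusionCache, HybridCache) can then make sure
// only one concurrent caller runs the factory; the others wait and reuse
// the result.
var product2 = await fusionCache.GetOrSetAsync(
    "product:42",
    async (ctx, ct) => await LoadFromDatabaseAsync(42),
    options => options.SetDuration(TimeSpan.FromMinutes(5)));
```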

Now, one common mistake: people start to learn about this and go, ah, I know, if the caching library has a method called GetOrSet, GetOrCreate, GetOr-whatever, it means it protects me from cache stampede. This is not true. So much so that, for example, one of the most used caches of all in .NET, MemoryCache, the one out of the box in .NET, does not protect you from cache stampede. So in the example I made before, a thousand requests for the same piece of data at the same time will get you a thousand database queries, which is a waste. It could be prevented.

If instead you switch to HybridCache by Microsoft, or FusionCache by me, even configured with only the first level, so you don’t need to specify a Redis instance or set anything up, it’s basically just a normal memory cache in a way, but it’s a higher-level cache which automatically gives you cache stampede protection for free. You just change the method call and that’s it. You can forget about this problem forever. I don’t know if I answered your question.

Jamie : Yeah, no, that’s a great answer to the question. Yeah, I like not having to worry about something ever again.

Jody : Yeah.

Jamie : So I guess the lesson here is go read the documentation for the library you’re using, folks, and make sure that you’re calling the methods in the way that they are meant to be called.

Jody : Exactly. Check if it supports, in this case, cache stampede or something like that.


You know that moment when a technical concept finally clicks? That's what we're all about here at The Modern .NET Show.

We can stay independent thanks to listeners like you. If you've learned something valuable from the show, please consider joining our Patreon or BuyMeACoffee. You'll find links in the show notes.

We're a listener supported and (at times) ad supported production. So every bit of support that you can give makes a difference.

Thank you.


Jody : By the way, since you mentioned documentation, I care a lot about documentation. So for the docs of FusionCache, I spend a lot of time, you know, drawing illustrations to clearly explain things and, you know, writing the text and then rewriting it to make it more approachable.

So if you are interested in these concepts, even leaving aside FusionCache for a moment. Who cares? You will not use it. Okay, fine. If you go to the FusionCache documentation, I hope that you will find some of these things explained in an easy way. For example, the page about cache stampede: I worked on it a lot and I evolved it over time to make it very, very approachable, so illustrations and stuff like that. The idea is you can learn something from it, even if you will not use FusionCache and use something else instead. It’s good.

Just a quick note here: we talked about just FusionCache and HybridCache, by me and by Microsoft, but there are other really good caching libraries out there in the .NET open source space. I would like to name a couple, because if people are interested in this, I would suggest they also take a look at these other libraries.

CacheTower, CacheManager, and EasyCaching. These three plus HybridCache from Microsoft and FusionCache from me, I think these are the five main caching libraries, multi-level, let’s say, caching libraries in .NET that people should take a look at because they are all interesting in some ways.

Jamie : It’s always worth knowing about the alternatives, right? Just because you’ve always used this particular library or this particular technology is not always the best reason to just continue to use it, right? Having a look around at seeing how other people are doing things is always a great positive thing to do. So I fully appreciate the shout-out to the other caching libraries. Definitely, folks, if you’re interested in putting some caching into your app, you need to know what your options are, right? So go check those out.

Jody : Yeah, I also put a comparison page in the FusionCache documentation just to know, you know, which features are there or not in this and that other library and stuff like that. So that may also be something interesting to look at.

Jamie : Amazing. I have to say that if the images in the documentation are anything like some of the images you’ve used in the NDC talk that I referenced earlier on, everyone will be able to follow those, I guarantee it, because those images are brilliant.

Jody : Thank you, thank you. Yeah, that’s basically my style. It’s also a way for me to decompress from coding. So you code, you code, you code and then, oh, okay, performance tuning, and then, oh, come on, documentation, and then, oh, okay, enough. I’ll draw something, I’ll draw an illustration. These are all hand-drawn and then I scan them and, you know, tweak them. But yeah, it’s a way for me to decompress a bit.

Jamie : Nice. So there’s lots of talk about caching today, lots of talk about open source. We have covered a little bit about what FusionCache is, how it works, how it relates to HybridCache and Redis and distributed versus in-memory and why you might want to put that in place.

So before we sign off, because I realise we’re running low on time, if folks want to learn about FusionCache, I know you mentioned the FusionCache documentation applies to not just FusionCache but has some generic caching advice and learnings in there as well. Is that just hanging off of the GitHub page for FusionCache, or is there a different website that you have to go to to get to that documentation?

Jody : No, it’s just that. It’s just that. In the main readme file I link to the main topics there, and I use that because it’s already a lot of work to work on documentation and I didn’t want to maintain a separate site. Honestly, the target is developers, so a GitHub repo is more than enough. So between the illustrations and the layout, I hope it’s as accessible as possible, even though it’s just a bunch of markdown files on a GitHub repo.

Jamie : Amazing. Okay. Obviously I’ll have a link to the GitHub repo in the show notes. So hey folks, just give that a click, a press, whatever action it is you do to navigate to that page. Go check that out.

What about if folks are listening in and going, this is great, I want to learn more about all of this stuff? Can they connect with you on socials and things like that? I know that some people don’t have, literally I’ve met some people in technology who don’t have socials because they just don’t have the time for it. So are you contactable?

Jody : Yeah, absolutely. All my DMs are basically open. You can find me on LinkedIn and Twitter, Bluesky, GitHub. Wherever there is a way to send a message, they are usually open unless I’m in a bit of a moment where spam is a bit too much. But as of today, not yet, so, okay. Yeah, contact me freely.

Also, if you’re interested, even before maybe going to grab the course, there are some videos that I made where people invited me, like you have been kind enough to do today. In the FusionCache repo there’s a link to a separate repo with the talks that I’ve given, including the NDC one that you mentioned before, and others. So yeah, you can start from there, all free. Then if you want to go deeper, maybe you can consider the course. But otherwise, it’s fine.

Jamie : Awesome. Okay. Then just to remind folks about the course, that’s on Dometrain, right? That’s getting started with, well, okay. At the moment, on the day that we are recording, Tuesday 11th of November, which yes, folks, we’re recording this instead of watching .NET Conf, the course that you have at the moment is getting started caching with .NET. There may be a second part by the time folks are listening, perhaps. I’m not trying to pressure you into releasing it by then.

Jody : No, no, no. There will definitely be. I don’t know the exact date when this will come out, but I hope I will finish working on it soon, let’s say a month-ish, something like that. It’s really a lot of work, so I cannot guarantee a date. But yeah, if this episode goes out after the end of the year, it will definitely be published. Otherwise Nick will kill me; that’s Nick Chapsas from Dometrain, yeah. The first one is already out, the second one will be out very, very soon.

Jamie : Amazing. Okay. Coming soon. Watch this space. Awesome. Well, Jody, I really appreciate you coming on the show today to talk about caching. Obviously, you know, I did cheat, folks. I have watched Jody’s talk at NDC and read some of the documentation, so I know some of the words to use. But yeah, no, I have had an amazing time chatting with you and I’m walking away from this knowing way more about caching in general, not just FusionCache. But also I have more thoughts and conversation around open source, which I think is really good.

That conversation we had about open source, where you just do it in the open anyway and tell people it’s experimental: it may not be for production, I’m just learning how to do this, please don’t use this in production. Or maybe you want people to use it in production, I don’t know. Yeah, that idea is fantastic. Thank you for that.

Jody : Thank you very much for inviting me. I really, really appreciate it.

Jamie : Hey, no, I appreciate having you on the show, Jody. Thank you very much.

Wrapping Up

Thank you for listening to this episode of The Modern .NET Show with me, Jamie Taylor. I’d like to thank this episode’s guest for graciously sharing their time, expertise, and knowledge.

Be sure to check out the show notes for a bunch of links to some of the stuff that we covered, and full transcription of the interview. The show notes, as always, can be found at the podcast's website, and there will be a link directly to them in your podcatcher.

And don’t forget to spread the word, leave a rating or review on your podcatcher of choice—head over to dotnetcore.show/review for ways to do that—reach out via our contact page, or join our discord server at dotnetcore.show/discord—all of which are linked in the show notes.

But above all, I hope you have a fantastic rest of your day, and I hope that I’ll see you again, next time for more .NET goodness.

I will see you again real soon. See you later folks.

Follow the show

You can find the show on any of these places