Backup S3, Google Drive, iCloud, Notion with Plakar


Table of contents (14 segments)

Segment 1 (00:00 - 05:00)

What does it actually take to build a backup system from scratch? Not just slap together some rsync scripts and call it a day, but engineer something that handles deduplication, encryption, compression, indexing, and restoring in one cohesive tool. Well, today I'm joined by Julian and Gilles, the CEO and CTO of Plakar, a new company, but built on nearly a decade of R&D in building open source backup systems from scratch. I'm excited about this one because Plakar goes beyond backing up your typical workloads like databases, file systems, and S3. It also has connectors for Google Drive, iCloud Drive, OneDrive, and even things like Notion, Dropbox, and IMAP. Yeah, you remember IMAP? Well, they've recently joined the CNCF, so we talk about their upcoming Kubernetes integration, obviously a big part of this channel. And on the show I suggested that they build the ability to back up to Docker images, basically to OCI registry storage, really, is what I was asking about. And within a couple of weeks they went off and built that as a new integration. That's pretty dope. This is the one backup tool that I've seen, maybe ever, but definitely recently, that not only looks useful for work and server workloads and clusters and clouds, but is something I want to try for backing up my own personal iCloud and Google Drive and Notion and all those things. Which I've found historically to be very problematic: it's challenging to find a free tool that isn't just cobbled together from a bunch of different pieces and that can reliably restore in a reasonable amount of time. So I'm excited about this tool, because it affects multiple parts of my job and life, and it's open source. So let's get into it. We're going to talk about backups. We've talked about that before, but there's so much more to backups. We're going to get into it. I'm excited.

You all started a company. How long ago?

We can give two answers on this one. The first one is that the company was incorporated in 2024. It's a quite new company, created to support this project, but the project itself is quite old, in the sense that Gilles did a lot of R&D on it over the past 10 years.

What I was noticing as I was digging through the project was that there are a lot of foundational things you have to define when you're creating a backup product that a lot of us don't think about. Not being a developer of a backup product, the only thing I can equate it to is creating a new database product, because you had to create a file format. With streaming backups, you go, I would imagine, much more low-level than the typical application developer has to, because you have all these underlying fundamental concepts: the backup file format, the caching of the backups, all that stuff.

Yeah, yesterday I was at a meetup and I was presenting the product, and someone asked me, why did you go into backup? That seems like a very boring area. But if you take it from a developer perspective, it goes from very low level to very high level. It crosses many fields of computer science that you might be interested in, if you have a high appetite for tech. You have to know how a file system works, how to manage your memory, how to manage high concurrency, how to manage file formats. In our case, we kind of developed a database in a sense, because you have a B-tree.
You have to figure out how to map your B-tree onto storage. So yeah, it's a very complete project for diving into technical topics. I thought, oh, this is going to be a small side project. And then you realize that you end up doing cryptography, compression, and stuff like that. Any area you look at, you're going to find ways to improve it and go further into the tech.

Yeah, I can only imagine how much time is spent on the engineering fundamentals of a giant file that you need to do various things with, because most of us don't deal with terabyte-sized files on a daily basis. To me, the biggest files I have to deal with are model files, open source models I download and upload; that's the biggest thing I deal with. Maybe if I were in an enterprise I'd have big backups and stuff like that. I used to manage backups at a government enterprise with about 7,000 users. That was 15 years ago, and I had two dedicated staff that worked for me. All they did was manage the storage and backups. Their entire job was ArcServe; I think we were using either ArcServe or NetBackup. But we had Windows machines, Macs, Linux machines, mainframes, and it had to handle all of that. And this was pre-cloud, so we didn't have to worry about how to back up cloud storage; we didn't even have S3 at the time, that wasn't much of a thing in the early 2000s. But it was so time-consuming, and recovery was such a nerve-wracking effort to deal with, which most people don't talk about. When you're talking about backups, everyone's concerned about the backup part.
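As a side note for readers, the low-level plumbing being described here, deduplication driven by data-dependent chunk boundaries, can be sketched in a few lines of Go. This is a generic illustration of content-defined chunking, not Plakar's actual algorithm; the window, mask, and minimum-size parameters are arbitrary:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Content-defined chunking with a simple rolling sum (generic sketch,
// not Plakar's algorithm). A boundary is declared when the low bits of
// the rolling value are zero, so boundaries depend on the data itself
// and survive insertions earlier in the stream.
func chunk(data []byte) [][]byte {
	const (
		window = 48    // rolling window size in bytes
		mask   = 0xFFF // ~4 KiB average chunk size
		minLen = 512   // avoid degenerate tiny chunks
	)
	var chunks [][]byte
	start, roll := 0, uint32(0)
	for i, b := range data {
		roll += uint32(b)
		if i >= window {
			roll -= uint32(data[i-window])
		}
		if i-start >= minLen && roll&mask == 0 {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

func main() {
	data := []byte("example data that would really be a large file...")
	seen := map[[32]byte]bool{} // content-addressed store: digest -> stored?
	for _, c := range chunk(data) {
		id := sha256.Sum256(c)
		if !seen[id] {
			seen[id] = true // only new chunks are written to the repository
		}
		fmt.Printf("chunk %x len=%d\n", id[:4], len(c))
	}
}
```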

Segment 2 (05:00 - 10:00)

And I always focus more on the recovery part, and I get more excited about the recovery. Like, how easy is it? How fast can I discover the thing that I need to recover? Because often that's the trick. If you're backing up hourly and daily and weekly and monthly, and you've got all these incrementals, all the traditional backup terminology, sometimes you're like, well, that person needs to recover that file, but it needs to be the one that's not from today, because that one was corrupted or whatever. So then you end up sleuthing through a giant catalog system, trying to find the one file or the one directory on one server somewhere among a thousand servers that you had to back up that day. And how do you do all of that reliably, and in a way that two people can handle the data? And now I don't know any customers of mine, or anyone, who has two people managing backups. It's like a part-time job for one person. So something has changed.

When you tackle that issue of how you find the proper thing to restore in a fast way, you end up realizing that you have to develop some kind of database. It's not just a backup, it's not just gluing files together into some kind of archive that's going to be efficient. You have to actually have indexes, find things in an efficient way, be able to generate diffs between versions of files, and do it in a way that can scale. Because it's not really just a volume sync. It's more: how many files am I going to have to look into? And a large volume of small files is as problematic to back up as huge files.

Also the performance: lots of small files isn't exactly performant on a lot of systems. But since I stopped managing backups, we got SSDs. I lived in a world where we had spinning disks and things were super slow, and if you had gigabit networking, you were actually doing great. But times have changed. So when I look at this, where's the elevator pitch? When I looked at the website, the things I took away were: one, open source, run on my own hardware, on-prem, wherever I want to run it; and two, it has this idea of integrations, which is not new. Most backup tools have to be compatible with this database file or this type of storage or this NAS or this iSCSI thing or whatever. But in your case, it looks like the integrations are more cloud-focused. They're dealing with HTTP, but specifically with different APIs. I saw Notion in the list, and I'm a huge Notion fan. I never thought about backing up my Notion like that. Now that I know it exists, I'm obsessed: maybe I should be backing up my Notion. How did the integration list happen the way it did?

Maybe I'll start and let you complete, Gilles. You realize that most of the SaaS providers right now are on a shared responsibility model. It means that you are in charge of the backup. In the case of Notion, for example, they are not providing any kind of backup, and you have to do it by yourself. And when you look at all the SaaS that you are using, for personal use or
even for enterprise usage, you see that you have a lot of holes in your resilience, in the protection of your data. I think it was important to have software that is able to manage, I would say, the legacy tasks, like backing up files, et cetera, but that is also able to back up any kind of data, including, of course, everything coming from SaaS. So at some point in the product we decided, okay, it's not a backup solution that is supposed to back up only files, but a backup solution that should be able to back up any kind of data. Maybe you can tell a bit more about how you did that?

Just to mention the open source part: the main driver initially was to avoid vendor lock-in. Because you don't have that many solutions today that can back up many sources and that are not closed. You have hacks, you have scripts that bundle a bunch of solutions, but you don't have one solution that you can trust. And it so happened that a friend of mine, who has a degree in computer science, so he's fairly educated on the topic, managed to lose all his data because he used a set of scripts that did not behave correctly. He did not realize, because everything seemed to be okay, until the day his server crashed and he had to rely on the restore part that everyone overlooks. And the thing is, if he had had a solution that was not a glue of multiple scripts and rsync and blah blah, this would not have happened. So now you end up having to look at which solutions allow you to back up multiple sources, and you generally end up going towards commercial solutions that provide support for multiple sources without hacks. And they will usually have some kind of closed format. So you have to trust that they will not go away, that they will not bump their prices, and that you can rely on them in the long run. What I wanted initially was a well-documented format that we are fully open about, with a license that prevents closing the code.

Segment 3 (10:00 - 15:00)

If we decided to go wrong, someone would just fork the code and it would go that way. So that's a safeguard against ourselves going wrong. And then, what do you do with that? How do you manage multiple sources? You realize that most of the open source solutions either are fairly targeted at doing synchronization, like rsync, and are twisted into doing backups through hard links and tricks, or they are highly built around the concept of a file system. So you can actually do a backup of an S3 bucket, for example, but that's using a trick to map the S3 bucket onto your file system. So they have limitations, and they do not work well when you break those limitations: if you create a bucket, put 2 million objects in it, and try to mount it on the file system, that's not going to work very well for you.

There had to be a disruption in how you model this. The issue is that you want to import various sources, you don't know these sources yet, you want to be extensible and have a plugin system, so you don't even know what plugins will be written a year from now, and you want to make it all fit in a model that will scale, with flattened data at the root. Designing this model, we came up with something abstract enough that you can kind of prove anything can go in. And we ourselves work with that abstraction, so all integrations benefit from the same deduplication, the same encryption, the same features. When you write a plugin yourself, you don't have to think about all the details. You just have to think about how to get the data from this point to that point; the tool does the work once it's there, through a very simple API. Most of our work was done on that: finding the abstraction that allows us to work efficiently while accommodating a wide variety of sources. Of the integrations we have, some are tagged stable and some are tagged beta, because we are a bit hard on ourselves. Beta does not mean it does not work; it means we want to show it works, and depending on how interested people are in a backend, we might drive it further in terms of production readiness. But they all work, to some extent.

I can imagine little edge cases in a lot of this stuff, especially when you're pulling and pushing from an API. I have interesting questions, like: okay, with Notion, how exactly does that recovery work? And what if there's duplicate data? We can all relate conceptually: most of us have dealt with file-based backups, right? Same single system, same host. Easy day; you're not even dealing with remote storage. Then people tend to evolve into, okay, now I'm doing SMB mounts or something to put some files elsewhere, I'm doing low-tech rsync or something. And then there's this giant chasm, I feel like, which is all those little utilities that are very niche and very composable, but where you're building your own scripts, essentially designing the orchestration yourself. And from there to a complete cohesive strategy that uses one or two products maximum, you suddenly jump into enterprise. There's a lot of enterprise backup garbage out there, I feel like, a ton of stuff,
especially when it comes to cloud APIs. I do this every year. I'm a small business of three to five people, depending on what year we're talking about, so I have some business needs, but mostly the backup needs I have are what an individual would have. I have iCloud, I have Google Drive, I have Dropbox, I probably have SFTP or FTP somewhere. I have Notion, I might have some S3 buckets, I have Macs, and I need to manage all these things. I have an Ubuntu server in the closet. There are things and places. I would honestly love my GitHub repos to be backed up automatically, just in case GitHub goes down and I need to move to GitLab or something. And when I look at just Google Drive or iCloud or OneDrive, any of the top three cloud file drives or whatever you want to call them, I couldn't find a single product on the internet that I could buy for one person. It seemed like all the products out there were like, yeah, we'll back up your company's Google Drive, because I have the company version of Google Drive and OneDrive, and those don't always work with all the consumer stuff, or if you're using Cyberduck or some other little utility. I was looking into backing up three people's Google Drive, and I was looking at possibly having to spend $500 a month on an enterprise piece of software because their minimum license purchase was like five or ten users. And I gave up. I eventually just gave up. I couldn't figure out a scenario that didn't require a bunch of weird scripts running cron jobs that would

Segment 4 (15:00 - 20:00)

probably never notify me of a failure. It was just a mess. So you guys show up, and suddenly I'm like, I could do this in an afternoon and it would cost me nothing, like pennies, with Plakar.

Yeah. And the nice thing about our open source dimension, because we are an open-source-first company, clearly whatever we do is open source unless there's a deliberate strategic reason not to, but that's the default, is to provide enough libraries and examples to empower users to actually extend the integrations. Our goal is not to handle all the integrations ourselves. For an integration that's fairly critical to companies, we would do it ourselves, to provide some level of quality. But if users want to implement specific integrations, we would help them move forward, because if you want one tool to back up everything, you'd have to have the manpower to do everything, which is not going to happen. One of the tasks we did today with my team was figuring out how to simplify the API even further, so people are less likely to shoot themselves in the foot while trying to do something simple. That lowers the bar: instead of spending your time writing a script that's not going to be very good, you write an integration, because it's as simple as that script. And it's going to be reviewed, you're going to get help from others, and it's going to fit into one thing that actually tackles the difficult parts. That's where I would like to get in terms of open source.

How does this work in terms of development? All the integrations are open source, but how many of those integrations are created by the community versus the core team? I'm assuming this is led by feedback, people asking for things, so you're motivated to make an integration for them. I'm just curious about the ratio.

Currently, all of the integrations were done by ourselves. A few months ago we pushed the SDK, and we are trying to provide examples and simplify even further. But it's the community that's driving the decision about which ones we do. For example, people have been asking for IMAP and Google Cloud and stuff like that, so we're going to spend more time doing those. The idea is to start growing the developer community, not just the user community, into building their own integrations.

Yeah, we've reached the right level, of easiness or difficulty, depending on how you see it, for writing an integration, because it now boils down to writing one function that scans and lets you enumerate your data, and provides an accessor to actually read it. Once you have that, it plugs into what we have, and you get all the benefits behind it. Which means that some integrations, like the Google Cloud integration, were done in half an hour, unplanned. One of the developers was like, oh well, I have half an hour, I'll do that. Okay, he has the knowledge, and you can assume someone who does not have the knowledge will take more time, but they're not going to go from 30 minutes to a month for that task.
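A hypothetical sketch of that "one scan function plus an accessor" shape, in Go. None of these type names come from Plakar's real SDK; they are made up to illustrate the boundary being described, where dedup, encryption, compression, and indexing all happen behind it:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Hypothetical shapes, for illustration only; Plakar's real SDK differs.
// An importer only has to (1) enumerate entries and (2) hand back a
// reader for each one. Everything else happens behind this boundary.
type Entry struct {
	Path string // flattened, source-relative path
	Size int64
}

type Importer interface {
	Scan() (<-chan Entry, error)             // enumerate everything in the source
	Open(path string) (io.ReadCloser, error) // accessor for one entry's content
}

// A toy in-memory importer, standing in for Notion, S3, IMAP, etc.
type memImporter struct{ files map[string]string }

func (m *memImporter) Scan() (<-chan Entry, error) {
	ch := make(chan Entry)
	go func() {
		defer close(ch)
		for p, v := range m.files {
			ch <- Entry{Path: p, Size: int64(len(v))}
		}
	}()
	return ch, nil
}

func (m *memImporter) Open(path string) (io.ReadCloser, error) {
	return io.NopCloser(strings.NewReader(m.files[path])), nil
}

func main() {
	var imp Importer = &memImporter{files: map[string]string{"notes/a.md": "hello"}}
	entries, _ := imp.Scan()
	for e := range entries {
		r, _ := imp.Open(e.Path)
		data, _ := io.ReadAll(r)
		r.Close()
		fmt.Printf("backed up %s (%d bytes): %q\n", e.Path, e.Size, data)
	}
}
```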
You're not reinventing the wheel every time you want to back up a different product. And I'm here for the Docker backups. I'm here for image registries as both a source and a destination. Years ago, I created a small script called Docker Volume Backup, something like that. It kind of took off a little bit, and Docker ended up adding it as an extension in Docker Desktop, and eventually they just made it a default feature of Docker Desktop. In the Docker community, volumes were never really meant to be moved around as images, but they're just files, right? I get more requests for fixing that shell script, and for working on it, than just about every one of my other examples. There's clearly a need for developers to move or back up volumes on their local system, whether it's Docker or containerd or CRI-O or Podman, it doesn't really matter. The developer sometimes wants to move the database files that are on that Docker volume somewhere else. And there's not really a move option, and no easy way; you kind of have to learn all these different commands for extracting it out. Do you put it in a tarball? A container image? All that stuff. So I'm here for that integration. Sign me up.

I'm going to make you laugh. Two days ago, during a sleepless night of wondering what I was going to do, I was looking into Docker, because we had a discussion a long time ago about how we could benefit from our deduplication to lower the size of storage for images: instead of stacking layers, each of the layers could be deduplicated. So I looked into it, because I had never looked into how the

Segment 5 (20:00 - 25:00)

backup of this stuff worked. And they use tar as a format, yep. And we have a tar importer, which can actually extract a tar and back up what's inside it, which means you could back up all your images and have deduplication across them. And I looked into how it happens with containers, and we can back up containers the same way, actually, metadata and all. So we have an integration that's not ready yet, because it's a small experiment, but it's something we could push forward: we already have a tar integration, so we can create a Docker integration, which is actually an integration that talks to the Docker API to get a stream of the tar that gets passed into the tar integration (a sketch of this idea follows below). And then, boom, you have a new thing that's packaged, and not a script on the side, which is the goal. That's the point of Plakar initially: to take the stuff where you go, oh, that isn't backed up, how can I actually back it up, and do it in a clean way without too much effort. Obviously there's some dev here, but once it's done, it's no longer dev for other people. And the idea is that if you trust the tool, you trust that your Docker backup works the same way as your SQL backup or your file system backup.

Yeah. There's even a scenario there, because, you know, a container registry is nothing but an object store, really. And in the cloud native community there's sort of a consensus around the container registry being the store of all artifacts. So now we have all these different types: container images are just one type for an OCI registry, but there are all these other file types now. We can store Helm chart data in there, we can store Compose files in there. Each one is its own thing; it's not a container image, it's just a registry artifact. And there are even utilities now that we use, when we have a new type of object to store in the registry, to create the metadata for all that. I'm not sure a registry is necessarily a great backup storage location; maybe an S3 storage system would be better, because all the clouds already have that. But they all also already have image registries, and a lot of times when I'm working with teams, if we're going to implement some new backup or replication system, or if we need storage for something, it's a lot easier for me to use what they already have. For ultimate backend storage, file storage and S3 storage probably make the most sense, but there are other scenarios, like container registries. And I love that on the site, the integrations page clues me in to which ones are inputs and which are outputs and which are both. Just staring at the options made me think: can I store Google Drive backups in OneDrive, and also store OneDrive backups in Notion? I started to wonder, what is my path? I have a document that maps all the things I need and where they all go for backups, right? If we're all talking about 3-2-1 storage for backups, we as backup engineers, even for your own stuff at home, often forget a year later what you did with it all, how often it backs up, and where it's going.
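A rough sketch of the Docker-to-tar idea described above. `docker save` and Go's standard `archive/tar` package are real; the rest is illustrative, not Plakar's actual integration:

```go
package main

import (
	"archive/tar"
	"fmt"
	"io"
	"os/exec"
)

// Stream `docker save <image>` straight into a tar reader, the way the
// conversation describes feeding the Docker API into a tar importer.
// Each entry could then be handed to a backup engine for dedup and
// encryption; here we just list the entries. Assumes the docker CLI
// is installed and the image exists locally.
func main() {
	cmd := exec.Command("docker", "save", "alpine:latest")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		panic(err)
	}
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	tr := tar.NewReader(stdout)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err)
		}
		// hdr.Name is e.g. a layer blob or manifest.json; identical layers
		// shared between images would deduplicate to the same chunks.
		fmt.Printf("%s (%d bytes)\n", hdr.Name, hdr.Size)
	}
	if err := cmd.Wait(); err != nil {
		panic(err)
	}
}
```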
I do tend to forget. I know I'm using Backblaze in some places and a different cloud in other places, and I have to document all of that for my own sanity, because every year I think, I should check my backups and see if they're still working. And then I forget where my backups are and how they work, and I have to go redo all that research. So the idea that I could get closer to having this in one product is something I might have to do and make a video on. So let's get some Docker stuff in there.

I think the only managed service that we are providing to the community right now is to send you an email if you have an issue with your backups, with a summary of what you are backing up. So maybe we can cover that one, because it will always be in your mailbox: what kind of backups you have and where they're stored. That could be nice.

Yeah. And at this point, once we have the image registry stuff, then you start talking about Kubernetes, and can we run this on Kubernetes? Let's talk about the storage for a little bit and get into the weeds of it, because I'm going to bring this up, and hopefully I'm not trolling your issues. I was looking to deploy this before the show so I could come to the show with feedback or experience to talk about, and say, yeah, I got it to work last night. The first thing I go for, being a Docker guy, is the Docker image. There's an open issue asking for the Docker image, and I think someone replied and said, oh, we're not ready yet; we've got caching, we've got other things to worry about, and we need to come up with a more cohesive strategy. So I'm here for that cohesive strategy. Let's talk about it. What are the challenges you're seeing, and do you have a plan? Because we talked a little bit before the show, and it sounds like there's stuff coming.

Segment 6 (25:00 - 30:00)

So, not to spoil anything, but let's talk about it. It's just that the feature request came very early. Our first server release happened a few months before this, we had the first user feedback, and we were trying to prioritize the things to tackle. This one came in and required some thinking from the team about what it means to have a Docker image for this. You have to actually mount your volumes within the Docker image. Will you run this as an agent in most cases, or is that what users want? That's an open question, not an answer. You have to think about how they are going to use it, because if you ship an official Docker image, you're going to have to support it in some way. You can't just say, okay, we released the Docker image, and then it doesn't do anything useful. We had users saying, oh, this is going to be run from CI, so it loses its state every time and needs to rebuild state. Okay, that's going to be an issue, because we need some persistence outside the Docker image for that. Some people said, oh, I'm going to use this from my machine. Yeah, but that means you could have installed Plakar on your machine rather than in Docker, because it's a lot of work for us to support a use case that's not that useful. Whereas you could launch Docker with Plakar in it as an agent that goes and queries the other things, because Plakar is flexible enough that you can run it as "I'm doing my job from the Plakar instance on my machine," but also as "this is controlling the backups of other machines and transferring data here and there." Depending on what direction you take, you would not build that image the same way, and I don't know which one to advertise the most without having user feedback on this. It needs discussion, and we need users from the community telling us what they need in the official image. That's what is going to drive the development. That's most of the issue there.

Is there a concept of backing up the system itself, like the configuration and the plugin list? Is there an internal backup command that lets me essentially save the state of the whole system, outside of the individual integration backups?

Well, there are two things. The first is that none of the state is mandatory. When you run an instance and wipe your cache folder, it's going to rebuild the state. You are going to lose the plugins that were installed, but you can just reinstall them with a click. So even if you did not have a backup of this, you are not lost: you could go from a blank machine, point it at the repository, it will synchronize again, and you will get a working state for your backups. If you want a backup anyway, because you want to avoid having to re-synchronize or reinstall, you can just back up the cache directory, and you get a Plakar snapshot with the configuration of your Plakar. There's nothing particular you have to do to make this possible; it's just the standard way of using Plakar.

Yeah. On the infrastructure side of the storage, you mentioned encryption. So do you support encryption of backups? We have it out of the box, for the snapshots, the backups.
We talk in terms of snapshots; a snapshot is a view of whatever you imported as data. All these snapshots are compressed and encrypted by default. You actually have to say "I don't want the encryption, I want to work with plaintext" to turn it off.

Oh, okay. That's nice. Secure by default. I love it.

It's end-to-end encrypted, so you don't have a server. You're going to run it from your machine, and you're going to say, my import bucket is on S3, and my storage is on Google Cloud. You don't have a server running at AWS or Google Cloud; all of this setup is stored in the configuration of the store that you create. It is standalone. We don't want to trust AWS or Google Cloud with keys, and we don't have a third party that would hold the keys to encrypt and decrypt. So we treat them as what we call dumb stores: they don't do anything besides passing along the packets they see and storing them. They're just a storage layer.

Yeah. So at the lowest level, what does it look like if I do the simplest deployment? Because I'm primarily an implementer, an operator, I often think about: what is it going to look like when I have this thing set up? Where are the pieces of the puzzle sitting, what do I have to run long-term, what ports do I need, and how do these things connect? I'm guessing there's a daemon, or whatever you call the server part, that runs somewhere all the time, and it has an API, and I can use a local CLI to control it.

There are two parts to it. There's the client part, which is you running Plakar to import data from a source and push it somewhere. And there's whatever storage you have, which may be a local disk or an S3 bucket, actually anything that can act as a key-value object store. So that could be AWS, and you don't have a server in between. You have Plakar operating as a client to your AWS bucket, for example.
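To make the "dumb store" idea concrete, here is a minimal sketch of client-side encryption before upload, so the storage provider only ever sees ciphertext. This is generic AES-256-GCM, not Plakar's audited construction, which differs in key derivation and structure:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// Encrypt a chunk client-side before it is handed to any storage
// backend. Generic AES-256-GCM sketch; a real scheme uses a proper
// KDF, per-object keys, and authenticated metadata.
func seal(key [32]byte, chunk []byte) ([]byte, error) {
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Nonce is prepended; GCM also authenticates, so any bit flip in
	// storage is detected at restore time.
	return gcm.Seal(nonce, nonce, chunk, nil), nil
}

func main() {
	key := sha256.Sum256([]byte("derive me properly with a real KDF")) // illustration only
	ct, err := seal(key, []byte("chunk payload"))
	if err != nil {
		panic(err)
	}
	// The "dumb store" (S3, GCS, a disk...) only ever receives ct.
	fmt.Printf("ciphertext: %d bytes, opaque to the storage provider\n", len(ct))
}
```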

Segment 7 (30:00 - 35:00)

So everything, the deduplication, compression, and encryption, is done on the client side, on the machine running Plakar. So when the traffic leaves the machine, you know that it cannot be tampered with without being detected, and it cannot be decrypted without the keys, which have never left your machine. That's the simplest case. That's what you would do for your home setup: you would install Plakar on your desktop and run it from there, saying the storage is over there, on my S3.

Then you have a different mode, which is that through the integrations you could have a server. You could have that Ubuntu machine in your closet running Plakar and taking care of connecting through SFTP to all the machines on your network and doing the backups. Then we have the non-open-source version, which we're working on, Plakar Enterprise, which provides a server that extends Plakar. So Plakar becomes the open source client to an enterprise product, the same tool you would use at home, and the enterprise version provides a server with additional features, like maintaining the privacy of the credentials for all of your storages. So your client on the workstation at work would connect to the Plakar server of your enterprise, and that server would hold the credentials to the actual storages, so they don't leak through the company, for example. You have all these different ways of working, which allows for very flexible setups, from "I have a single machine and it connects directly to my store" to "I have segregated traffic and isolated machines that have different privileges and cannot access that S3 bucket, but they can access this one," with some layer of validation, as you'd want at an enterprise level.

Yeah. Does that allow for, like, multi-hop backups? I'm trying to think of some of my more challenging enterprise setups. We had so many backups happening that one server couldn't do them all, for bandwidth purposes. So in one scenario we ended up with a main orchestrator server, and it had multiple backup boxes, we just called them agent boxes, whose purpose was to back up the data and create the snapshots. They weren't necessarily backing up themselves; they were backing up other machines that also had agents. A middle tier of fan-out. We needed to back up a certain number of terabytes every 24 hours, and at the time, this was 20 years ago, we were limited to one-gigabit network connections. So we were literally creating new servers in the middle tier because we were saturating pipes and couldn't get all the backups from all the different systems done in a 24-hour period. But there needed to be a central orchestrator that managed the jobs it distributed to the individual middle-tier boxes. And there are a lot of small shops I deal with where one person is saddled with DevOps, and ops, and backups, and recovery, and monitoring, and logging, and storage, and cloud infrastructure. They're just having to do it all. I call them solo DevOps; that's the label I give to these unfortunate individuals who are given way too much work to do. Maybe AI will help them; maybe we can rely a little bit more on AI for advice on that.
But it just ends up being a whole lot, right? And I saw that you had a demo on the website, so it looks like you're bringing that up. I'm just curious how big this gets today, and where your vision is for the enterprise product you're building.

Ransomware attacks now target the backup system of any company first. So, for different kinds of reasons, encryption is required, and you need to be sure that your storage and your backup server don't hold the credentials for the encryption key. Otherwise, if your backup system falls at some point, the attackers have all your data at once. So end-to-end encryption is clearly becoming a kind of prerequisite for securing your backups right now.

If we step back a bit: the issue you mentioned about the size of backups, we solved that in the past with deduplication, mainly on the filers. Basically, it was the storage that was optimizing the space, deduplicating the data. But that works only with unencrypted data, because with encrypted data you are, of course, losing the deduplication. So today we are in a situation where companies want end-to-end encryption on their data, but then they cannot use the deduplication of the filers.

Segment 8 (35:00 - 40:00)

And so storing the backups would have a crazy cost. A lot of vendors basically created alternatives with proprietary formats, where they are still optimizing the space, but in the end they still hold the encryption key. What we are trying to do with Plakar is to solve this issue. Because we are doing the encryption, the compression, and the deduplication at the source, it means that all along the path the backups travel, they are already highly optimized in terms of storage and space. I think that's what we are showing on the demo website. You can see that we have done almost 15,000 backup cycles, snapshots, on this machine. The logical size is 24 terabytes, so we have a huge amount of data here, but the space we are actually using is only 159 gigabytes. And everything is fully encrypted; the storage has no knowledge of the encryption key. I think that's the game-changing part of this technology, because it allows you to move your backups anywhere you want: any cloud provider, on-premise, wherever you want, even if you don't fully trust the provider. And you can do that with optimized network cost, because sometimes, if you want to synchronize data between cloud providers, you have to pay egress costs, which are super expensive for huge amounts of data. With Plakar, you only pay that for the first backup, and all the snapshots that follow will only transfer the few blocks that were not backed up before. So the storage is optimized, and it's fully end-to-end encrypted. Today I don't know of many options that make that happen.

Yeah. So it's doing incremental backups after the first one? Or incremental snapshots, I guess.

There's a distinction; I will let Gilles explain.

Yeah. When you have an incremental backup, you actually create a chain of dependencies between all your snapshots. The longer your chain goes without another full baseline, the more you increase the likelihood that a corruption at some point will break the whole thing. So you have a higher risk, and you have to test everything very often, because you want to limit that risk. You don't want to go through the hassle of doing incremental backups, not test them, then test in a week and realize you have one week's worth of deltas that are trashed, basically.

The idea is that you can also take an approach that is index-reference based, where basically you are not building a delta against what happened right before, but against what's in the store as a global repository. So your backup benefits from anything any of the previous ones did, and you don't have a chain of dependencies: you can delete the snapshot from yesterday, and it's not going to break any dependency of the one you have today. As long as your store is reliable, you can do any kind of removal you want, with the granularity you want. We can consider the snapshots autonomous, in the sense that each snapshot is autonomous by itself and does not require any other one. The thing is, you have to trust the storage anyway; you're going to store your data there.
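A toy model of the index-referenced approach being contrasted with incremental chains, illustrative rather than Plakar's on-disk format: snapshots reference chunks by digest in one shared pool, so deleting any snapshot leaves every other one restorable.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Toy model of index-referenced snapshots (not Plakar's real format).
// Snapshots reference chunks by digest into one shared pool, so each
// snapshot is self-contained: deleting yesterday's snapshot cannot
// corrupt today's, unlike a chain of incremental deltas.
type Repo struct {
	chunks    map[[32]byte][]byte // shared, deduplicated chunk pool
	snapshots map[string][][32]byte
}

func (r *Repo) Backup(name string, parts [][]byte) {
	var refs [][32]byte
	for _, p := range parts {
		id := sha256.Sum256(p)
		r.chunks[id] = p // no-op if the chunk already exists: dedup
		refs = append(refs, id)
	}
	r.snapshots[name] = refs
}

// Delete removes a snapshot, then garbage-collects chunks no other
// snapshot references. Remaining snapshots stay fully restorable.
func (r *Repo) Delete(name string) {
	delete(r.snapshots, name)
	live := map[[32]byte]bool{}
	for _, refs := range r.snapshots {
		for _, id := range refs {
			live[id] = true
		}
	}
	for id := range r.chunks {
		if !live[id] {
			delete(r.chunks, id)
		}
	}
}

func main() {
	r := &Repo{chunks: map[[32]byte][]byte{}, snapshots: map[string][][32]byte{}}
	r.Backup("monday", [][]byte{[]byte("A"), []byte("B")})
	r.Backup("tuesday", [][]byte{[]byte("A"), []byte("C")}) // "A" deduplicates
	r.Delete("monday")                                      // no chain to break
	fmt.Printf("chunks kept: %d, snapshots: %d\n", len(r.chunks), len(r.snapshots))
}
```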
If you don't trust it, well, you have to do something that's called 3-2-1 backup, to ensure you don't have just one copy of your backup, and so you can restore a broken backup from another backup. That's the idea, and we have a cool way to manage it. This allows you to have all the benefits of incremental backups without the risks of incremental backups.

Yeah, that's nice.

With the sync command right now, you can very easily synchronize a Plakar store to several locations. Basically, you push one backup into a store, and you can have two or three stores replicating that data in different locations. Having all your backups in one store would be too risky, of course; you need a backup strategy on top of it. And we provide a cool way to do that: you can synchronize a store to several locations, again with low cost in storage and bandwidth. You push to one place, and you are able to have two or three copies, even in cold storage, to be sure your data remains safe if the first storage has an issue. And yes, with different granularities, because you might say, oh, since it's encrypted,

Segment 9 (40:00 - 45:00)

you need to have the exact same copy everywhere. But that's not the case, because the snapshots are individual. So you can actually say: I have one store, on the NAS near my machine, and I back up my machine to the local disk, just to recover from user error, because if I delete something, I have it immediately available. But I might synchronize one snapshot per hour to the NAS, and have that one fan out again, one copy into AWS and one copy into Google Cloud, for example. And the synchronization is not doing another backup; it's really pushing a copy of the snapshots to the different stores, possibly with different encryption keys as well. It does the transformation from one to the other, so in the end each store has its own encrypted copy of the same data. This saves you in cases where, for example, you have your machine and you want to back it up to two places. If you did it natively, oh, I'm going to do a backup to AWS and a backup to Google Cloud, something may have changed in between; you're not backing up the same thing. When you're doing a sync, you're getting the info from one of the stores and transferring it to the other, so in the end they have the same data, which has its benefits if you lose something. And it repairs the store as well: if you have a corruption, if you break something, you can actually repair it from the other one. You can also, of course, run checks on the store to verify the data is still what you expect, in two different ways. And we have R&D projects on error-correcting codes for auto-repair, maintenance, and things like that.

Just to be clear, the crypto, we did not do it ourselves. We are a team of people who have worked in security a lot; we have been through crypto specs in banking and such, so we had a good sense of what should be done where. And we had an external, independent audit by a famous cryptographer, whose book I have behind me, precisely so there's no bias: he has no interest in validating something that would be broken. That was just to have a third party. We managed to put cryptography in every layer as a validation concept. You have HMACs everywhere, so if you flip one bit somewhere, it's going to break in the nice sense: it's going to tell you there's a corruption there, in that specific file, and that it cascades, because this file and that file also shared that data, so they are all corrupted. So we already have the detection part at a very granular level, in the sense that it can pinpoint very specific chunks and objects. And having that, plus the ability to synchronize, we can build tools on top that allow a very pinpointed repair. Without having to repair everything, because that's costly too, we're going to be able to say: oh, I have one chunk that's broken, and I can fetch it from over there; I'm going to fetch just that amount of data. And, as I said, the error-correcting codes: since we can detect everything that's broken, we can add error-correcting codes on top that could auto-repair. Repair in a buffer, verify that it's correct, check against another repository that it's correct, then do the repair for real and apply it.
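A small sketch of the "HMAC everywhere" idea, generic rather than Plakar's exact construction: each chunk carries a MAC, so a single flipped bit is detected at chunk granularity and can be repaired by fetching just that chunk from a replica.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// Generic illustration of per-chunk authentication (not Plakar's exact
// construction): every stored chunk has its HMAC, so corruption is
// detected at the granularity of a single chunk and repaired by
// fetching only that chunk from a synchronized replica.
func mac(key, chunk []byte) []byte {
	h := hmac.New(sha256.New, key)
	h.Write(chunk)
	return h.Sum(nil)
}

func main() {
	key := []byte("repository MAC key (derived properly in practice)")
	chunk := []byte("some deduplicated chunk")
	tag := mac(key, chunk)

	chunk[3] ^= 0x01 // simulate one flipped bit on disk

	if !hmac.Equal(tag, mac(key, chunk)) {
		fmt.Println("corruption detected in this chunk; fetching it from a replica")
		replica := []byte("some deduplicated chunk") // healthy copy from another store
		if hmac.Equal(tag, mac(key, replica)) {
			chunk = replica // pinpointed repair: only this chunk is rewritten
			fmt.Printf("repaired from replica: %q\n", chunk)
		}
	}
}
```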
So we have all these paths of possibilities that we can implement, and they're not that far away, because we have branches that are working. They're not production branches right now, but they're working enough that you can say it's not just an idea; if that were the focus, next month it would be there, because there are enough bricks to prove it. And we have a ton of these ideas about what tools would make it more reliable. It's already reliable in the sense that you would detect that something is corrupted, but how can you make it so reliable that people will not be stressed when that happens? That's the goal.

As an architect, I've been in so many incident rooms over Slack with people who lose their minds when they have an incident and the backups are hard to manage. Because, sure, we know we have backups; now we have to actually go into the backups, and we never do that, so now we have to figure out how. And if one of them is corrupted, then yeah, it's stress plus-plus. You go to a very high level of stress. And we want to be in a situation where they do not face that: okay, there's a corruption, even in your backup; well, there are ways to get out of this, and most of them are automated.

As you manage backups, there are these three phases for the operations engineer managing them. There's the implementation, which is obviously very time-consuming: you're learning the product, and you're testing backup and restore so you can believe they'll work. Then, once the project is implemented and you feel like everything's going to work in a recovery, you tend to leave it alone, right? You're checking that things are going fine; as new infrastructure shows up, you're adding or removing jobs or whatever. So you're in maintenance mode. But then there is that incident day, when they call the backup person and say, okay, we need to bring you into the incident room, or into the Slack channel or whatever,

Segment 10 (45:00 - 50:00)

because we now need a recovery. And typically, in most of the teams I've worked in, not everyone can restore, right? There are only one or two people who know the restore tooling. And in that moment, I can viscerally remember being the manager of the people managing the backups and starting to doubt everything. They're about to test the restore and I'm doubting: when was the last time we verified this particular integration? We've had three major version upgrades and we've never tested since the initial deployment, so we don't even know if this restore will work. We recently had to replace three of the drives in that thing; is there a potential for some disk corruption we didn't know about, because the files just sit there, never get touched, and die slowly over time? There are so many moments in that where I'm worried someone's going to get in a lot of trouble, or fired. And then the recovery happens, and it works. Or maybe it doesn't.

There was one time where we ended up having corrupted files, and we had to go to offsite tape from a month earlier, because we had this process where once a month we would send tapes offsite to a different data center in a different part of the state, 300 miles away. The goal was that no storm could take out both; that's the third copy of the 3-2-1, right? It's a state away, it's been driven there by one of our staff, we know it's physically there. And we had to go pick those tapes up, and they actually worked. But it took like a week. It was after a hurricane, we had a flood, we had servers underwater. And that whole week I was just so nervous that these things weren't going to restore; we were basically going to start from six-month-old data at best. And luckily it worked. We don't talk about those kinds of horror stories enough.

You know, one of the reasons I think people are stressed is that most companies don't have a backup team. The big ones do; the others don't. The people who are given the task of backups have it fall on them as part of a long list of other things to do. They have to get rid of it fairly fast, and it's not a topic they're interested in. It's just: yeah, you have to do backups before Friday. Okay, what do I have? There's a list of 10 tools, none of them seems appealing, I'm going to take one that's popular, because no one gets fired over a popular tool. That's going to be the decision driver. And then, if they don't have to use these backups, since it was just a task dropped on them, they're not going to look at the backups once they're set up. They will check that the backup happens on a regular basis, because it's supposed to run every day, but they will not inspect the data, because they have other things to do.
The other thing is that in most tools, the backups are kind of dead data, in the sense that they are meant to be backups and have no use other than being backups. When we design stuff, we're more interested in how you actually use the data, because if you have no use for that data, you're not going to look into it. If the data you backed up is actually usable, in a very usable way, and you actually use it every day, then you have fair confidence it's not corrupted, because you've been using it these last few days. The demo website, and that's just the open source version, has previews of files: you can preview photos, videos, audio. If you actually use that snapshot, which is a backup, browsing it on screen, you use it the way you would use your Google Drive: every day, looking into things you actually manage. Oh, I want to look at the content of a file, and I'm going to use the snapshot, not a copy I have on my machine. Well, then you know it works, because you actually viewed it recently. And it becomes immutable data, in the sense that you can't alter it; it's read-only data, but read-only data you actually use, which makes the data a bit less dead and a bit more lively. If you have a use case like that, and you enter an incident and have to restore something you've actually been using every day, through a web interface or by mounting the snapshot on your system as a local directory, you're not as stressed, because you know it works. You've removed the painful part of the question of checking your restores.

Yeah. Is the check similar to a mock restore? Yeah, it's an in-memory restore that discards the data after doing the cryptographic checks. It's as if you restored into RAM and validated all the checksums, but we do it in a streaming way, so you don't have to hold the whole snapshot in memory. Okay, yeah, because it has to de-dupe, it has to read all the data.

We have a couple of questions. I don't even know if this is a thing.
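As an aside, that kind of streaming verification can be sketched like this, illustrative only; Plakar's actual check also validates its own cryptographic structures. The restore stream is hashed as it flows and the bytes are discarded, so memory stays constant regardless of snapshot size:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"strings"
)

// Streaming "mock restore": run the restore pipeline, hash the output
// as it flows past, and discard the bytes. Nothing is written to disk.
// (Illustrative sketch; a real check also verifies per-chunk MACs.)
func verify(restoreStream io.Reader, wantSum [32]byte) error {
	h := sha256.New()
	// TeeReader hashes everything we read; io.Discard throws it away.
	if _, err := io.Copy(io.Discard, io.TeeReader(restoreStream, h)); err != nil {
		return err
	}
	var got [32]byte
	copy(got[:], h.Sum(nil))
	if got != wantSum {
		return fmt.Errorf("checksum mismatch: snapshot is corrupted")
	}
	return nil
}

func main() {
	payload := "pretend this is a multi-terabyte restore stream"
	want := sha256.Sum256([]byte(payload))
	if err := verify(strings.NewReader(payload), want); err != nil {
		panic(err)
	}
	fmt.Println("snapshot verified without restoring to disk")
}
```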

Segment 11 (50:00 - 55:00)

Is there a plugin for cyberattack detection? And I asked: are you talking about ransomware, like detecting ransomware encrypting everything? Yes, is there something like that? Is that a thing? And what do you think about ransomware? Do you do anything for ransomware?

We do something, but the posture people should have is: assume the data is tossed. You have to have a copy elsewhere that's not reachable by the ransomware. You have to have data that's offsite and not on the network. That's the only way to be sure, well, relatively sure, that the ransomware is not going to affect you. Then, once we've tackled this and told people not to trust anything other than that, there are all the solutions that are best-effort. For example, we compute the entropy of files and directories, and we store this as part of the metadata of each snapshot. So you could actually use a diff-like approach to compare whether the entropy drastically changed between two snapshots: this directory that had low entropy before has very high entropy now. And the stores we push to are append-only, so you're never editing something in place; you can even have WORM enforced at your provider level. If you have the entropy checking, plus the offsite and offline copies, you're in a fairly good situation, because for attacks that did not completely trash your store, you can still say: oh, that machine had the ransomware, it pushed a backup with ransomware in it, but the other snapshots are not affected, and I can remove the broken one, because they're immutable. And if that did not work, you can go back to: I have an offline copy, I have an offsite copy. So you have to manage this for yourself; you can't just trust a software solution to take it over somewhere.

Yeah, I like the entropy idea, though. You're basically talking about: if the change rate on this particular backup is normally 10 percent a day, having something that notifies you when it's double that, 20 percent changed today, or whatever.

Yeah. And you will probably have an alert on the size too, because everything is deduplicated: usually you can do something like 10,000 cycles with Plakar without really increasing the size of the storage. You can increase the frequency, because we are just storing a little metadata and only the changes between two snapshots, so you can take your backups from every day to every hour, or every minute, depending on the size of what you are backing up. If you have ransomware, you will have an alert on the size, because at some point it will double the size of your storage, and that's something that should never happen.

Yeah. And you have the fact that, as I said very early in the interview, we have built some kind of database in a sense. We have multiple indexes: as you saw in the demo, we can look up images or videos, because we also index MIME types and things like that. And the MIME types should align, in some way, with the entropy of the data: if you have a plain-text file and it has high entropy, you're going to raise alerts, because that's not a good sign.
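As a concrete version of that heuristic, here is a small sketch of the Shannon entropy measurement being described (illustrative, not Plakar's implementation): encrypted or scrambled data sits near 8 bits per byte, while plain text sits much lower.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math"
)

// Shannon entropy in bits per byte. Plain text usually lands around
// 4-5; encrypted or compressed data approaches 8. A plain-text file
// that suddenly measures near 8 between two snapshots is a red flag.
// (Illustrative heuristic, not Plakar's implementation.)
func entropy(data []byte) float64 {
	if len(data) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	var h float64
	n := float64(len(data))
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	text := []byte("ordinary notes, config files, source code, readable things")
	scrambled := make([]byte, 4096)
	if _, err := rand.Read(scrambled); err != nil { // stand-in for ransomware output
		panic(err)
	}
	fmt.Printf("text entropy:      %.2f bits/byte\n", entropy(text))
	fmt.Printf("scrambled entropy: %.2f bits/byte\n", entropy(scrambled))
	// A snapshot-to-snapshot diff of these per-directory values is the
	// kind of "fishy scenario" detector described in the conversation.
}
```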
Those are just the heuristics that come to mind right now, but there are many other ways to detect a fishy scenario as it gradually takes place. If the ransomware is already fully there, you'll know, because you're being asked for money. But if you're in the middle of the attack and a backup happens where half the data is corrupted, half the assets are corrupted, you're going to detect it through entropy and metrics like that.

I feel like if I had to build something myself, it would end up being stupidly simple. I'd create a monitoring solution that watches a plain text file, something like don't-encrypt-me.txt, that I put on every single file share and every single server. If any single one of them ever changes, I get an alert; some agent detects all of them as the first line of defense. It doesn't even wait for backups to happen; it just says: hey, this file changed. Because the way I've seen these ransomwares roll out, they start small and then spiral, so there are early indicators in the early hours. If you've got terabytes of file storage, that doesn't all get encrypted at once, and not everybody has permissions to everything, so it typically starts in little places. So I'd probably seed these little files everywhere.

Setting aside how smart the detection is, only the offline backup is going to save you. There's no other way to be sure. It takes only one miss: if you misdetect and let the ransomware happen, it's already done. You don't have the luxury of trying things out to see if they work.
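Bret's canary-file idea above is easy to prototype. A minimal polling sketch follows; the file name, the interval, and the "alert" (just a log line) are all placeholders, not anything Plakar ships.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"log"
	"os"
	"time"
)

// hashFile returns the SHA-256 digest of a file's contents.
func hashFile(path string) []byte {
	data, err := os.ReadFile(path)
	if err != nil {
		// An unreadable or deleted canary is itself an alert condition.
		log.Fatalf("ALERT: canary unreadable: %v", err)
	}
	sum := sha256.Sum256(data)
	return sum[:]
}

func main() {
	// Hypothetical canary path, seeded on every share and server.
	const canary = "dont-encrypt-me.txt"
	baseline := hashFile(canary)
	// Poll every 30 seconds; a real agent would page someone instead of
	// exiting, but the detection logic is the same.
	for range time.Tick(30 * time.Second) {
		if current := hashFile(canary); !bytes.Equal(current, baseline) {
			log.Fatalf("ALERT: canary %s was modified", canary)
		}
	}
}
```

The point of the design is that it fires in the early hours of an attack, before any backup cycle has a chance to notice the entropy shift.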

Segment 12 (55:00 - 60:00)

Yeah, so you have to have your own offline backup. That's the only solution. Everything else is nice to have on top, but it shouldn't be a blocker for the most annoying part, which is the offline backup, the one that takes the most effort to produce. Well, I like those read-only S3 buckets. Those are something I like to use for ensuring files can't be deleted, so my backups can't be changed. Did you hear about the UniSuper incident, where Google deleted the client's entire deployment, availability zones and region included, and they lost everything? They basically dropped the client's billing account and it cascaded everywhere. So I would not rely on S3 alone. Oh, for sure. I just mean that, unlike a normal file server or any drive storage, I can more easily ensure that things written to buckets don't get changed later, whereas everything on a file server is up for debate in terms of what can access it. But yeah, good advice.

One other question: Gartner recently introduced the cloud-native infrastructure recovery category in their latest hype cycle for backup and data protection technologies. Where would you position Plakar in "CIRES", or whatever the acronym for Cloud Native Infrastructure Recovery something is? Let's say that we announced last week, or this week, that we are joining the Linux Foundation and the Cloud Native Computing Foundation; basically we are joining those two. Sandbox, maybe? Are you going for Sandbox? When you join, are you donating the project or just becoming a member? We are donating, basically, to be part of the foundation. Why? Because it's the first step to understanding how this ecosystem works right now. We have to admit that Gilles is coming from the BSD world and I don't have that much experience with it either, so we have to figure out how we can integrate, and this was the first step. But yeah, we are currently working on Kubernetes support. We hope that for Cloud Native Paris, on the 3rd of February, we'll have something usable. And we really think that a layer is missing in cloud native around resiliency and backup, and with Plakar we'll try to contribute, try to bring up this layer and make sure that whatever data you have to back up, you'll be able to rely on it.

At my previous job I was managing a quite large team at a big e-commerce company in Europe, and I was fighting every quarter to make sure all the teams did their backups. But the thing I never achieved was making sure every team had the 3-2-1 rule covered, with everything encrypted. I think what is game-changing with Plakar right now is that we decoupled the storage from the technology used to store your backup, so you can store your backup anywhere without trusting your provider. What does that change? You could imagine, and it's what we are releasing right now, a protocol where you push your backup to a provider, and the provider manages the resilience of your data without any knowledge of your encryption key, while we maintain low network cost and low storage cost. And I think that's the kind of layer that is missing right now.
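The zero-knowledge idea described here boils down to encrypting on the client before anything leaves the machine. As a rough sketch of that principle, using Go's standard-library AES-256-GCM as a stand-in (this is not Plakar's actual format or key management):

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// seal encrypts a backup chunk with AES-256-GCM. Only ciphertext ever
// leaves the client, so the provider can replicate, store, and expire
// blobs without being able to read any of them.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Prepend the nonce so the key's owner can decrypt later; the
	// provider holds the blob but never the key.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
	key := make([]byte, 32)
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	blob, err := seal(key, []byte("chunk of backup data"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("ciphertext is %d bytes; ship this, keep the key\n", len(blob))
}
```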
The goal is being able to back up all your objects, whatever they are, and just push them to a third party. That could be your own company; it could be a team in your company that manages backups. But today the issue is that pushing all your data to one team in your company along with the encryption keys, to optimize the storage, is a big bottleneck in terms of security. So what we're enabling with this new protocol is the ability to ask every team: make your backup, push it to a third party, internal or external, and that third party will manage resilience without any knowledge of your data. You can make two copies with two different cloud providers, for example, one of them offline, and do it in a clean way. I think that's the kind of contribution we can bring to the ecosystem. But yes, of course, we want to do something to solve this whole resilience issue.

The way you're describing that, I can't help but think an OCI registry would be a good option for it: it's content-addressable, it has SHA-hash-guaranteed unique identifiers, it's read-only so you can be assured of integrity, and it carries all the metadata. So I'm going to put my vote in for that, but it sounds like you're building something custom. So I was going to ask, as we wrap this up, because we're running a little long:
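Back on the read-only bucket point from earlier in this segment: the usual way to get WORM semantics on S3 is Object Lock. A minimal sketch with aws-sdk-go-v2 follows; the bucket name is a placeholder, and note that Object Lock has to be enabled when the bucket is first created.

```go
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"github.com/aws/aws-sdk-go-v2/service/s3/types"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg)

	// Every object written to the bucket gets 30 days of COMPLIANCE-mode
	// retention: nobody, not even the account root, can delete or
	// overwrite it before the clock runs out, which is the property you
	// want from a backup target under ransomware.
	_, err = client.PutObjectLockConfiguration(ctx, &s3.PutObjectLockConfigurationInput{
		Bucket: aws.String("my-plakar-backups"), // placeholder bucket name
		ObjectLockConfiguration: &types.ObjectLockConfiguration{
			ObjectLockEnabled: types.ObjectLockEnabledEnabled,
			Rule: &types.ObjectLockRule{
				DefaultRetention: &types.DefaultRetention{
					Mode: types.ObjectLockRetentionModeCompliance,
					Days: aws.Int32(30),
				},
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("default WORM retention applied")
}
```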

Segment 13 (60:00 - 65:00)

What's next? It sounds like what's next is initial Kubernetes support. When you say Kubernetes support, are you talking about running it on Kubernetes, or backing up Kubernetes volumes, or both? We had a discussion about the proper way to integrate with Kubernetes; it's one of our developers working on it. The question was: should I do one integration that covers everything? And I said no, we have to decouple the control plane and the data plane. I want to be able to back up all the YAMLs from my configuration, and I want to be able to selectively back up some of my volumes. I don't want to have no option but to back up everything or nothing. So these are either two separate integrations, or one integration operating in two different modes.

The idea is to tackle all of it, and we were also looking into a Velero integration. It's always tempting to go your own way and do your own integration, but there's also the pragmatic route: if we can adapt to being run by Velero, through Velero, then you get all the possibilities. First, our simple integration to back up the kube configuration, which can also be used through Velero to fit into people's existing setups, so they can just swap between solutions and test us while retaining whatever they're already using through Velero. Then we can have a third way of doing it, which is our own, but that would come last, basically. Just to say, we're not planning on being just a solution that runs within kube, but also a solution that manages to back up your kube.

Yeah. One of the challenges I've been seeing in the industry is that you've got the traditional backup vendors with their plugins or integrations or whatever they call them, and you're paying lots of money, and they charge extra for certain things, maybe the Oracle integration costs extra, and they're all closed source. Then you have these open source things like Velero, but the challenge is that it's just Kubernetes. It's great at Kubernetes, but it's only Kubernetes, and I don't really work with any teams that are only Kubernetes. Even if they're Kubernetes-first and container-first, they're going to have other things, and then they need a completely different set of tools for that stuff, and the two don't meet: Velero is backing up to whatever storage you plug in on the back end, and this other system is completely separate. At this point it feels like all my clients have multiple CIs, multiple backup systems, multiple clouds; there is no one thing. They're doing everything multiple times: multiple types of databases, multiple database providers. So the challenge, I always feel, isn't to get to one universal backup system; it's to get down to as few as possible so you can maintain them. Yeah, that's the goal. But imagine you have ten separate tools, because that's what I saw at some previous companies: many teams, none of which reached a consensus about the proper solution, each coming up with its own, and you end up with ten.
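Circling back to the control-plane half of that split: backing up "all the YAMLs" can start as simply as listing resources and serializing them. A toy sketch with client-go follows, covering Deployments only for brevity; this is an illustration of the idea, not Plakar's actual integration, and a real tool would walk every API group and pair this with separate volume snapshots for the data plane.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"sigs.k8s.io/yaml"
)

func main() {
	// Load the local kubeconfig, as kubectl would.
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// Empty namespace ("") means: list Deployments cluster-wide.
	deps, err := client.AppsV1().Deployments("").List(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	// Emit one YAML document per Deployment; a backup tool would write
	// these into a snapshot instead of stdout.
	for _, d := range deps.Items {
		out, err := yaml.Marshal(d)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("---\n# %s/%s\n%s", d.Namespace, d.Name, out)
	}
}
```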
Even if you just reduce those ten to three, that's a net win over having to manage ten different solutions. And in our case you can do something stupidly easy: you can say, I want an integration that backs up the output of another backup system. Then everything ends up falling into Plakar through the system of integrations, which can ingest data from whatever solution. I'm saying it's a possibility through the integration system, but it also means you have a way to progressively unplug older solutions as integrations get written, while still having everything in Plakar from day one. Say you want to back up some system and we don't have an integration for it, but we do have an integration for your backup tool: you back up through the other tool, and we back up the result of your backup. Progressively, as we get the integrations to handle your tools natively, some of those tools get out of the way. The idea is to let people do that. Obviously we're not enough people to write the hundreds of integrations we'd need, but having simple SDKs, providing good examples, and starting with the most popular ones will get us there ultimately; that's the idea.

Yeah, and that's how tools like Cyberduck, CyberDrive, that whole project ecosystem work, with dozens of different storage integrations. I like using those tools because they've got GUIs, they're user-friendly, and they're really great for personal backups and personal file management. The magic of that kind of tool is that it works with everything; every cloud storage scenario you can think of, it's got a plugin for. So I feel like the integration or plugin ecosystem is, in a lot of ways, the magic of what makes a backup product really interesting: all the different things I can back up, just in case.
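To give a feel for what such an integration SDK might look like, here is a purely illustrative Go sketch: the interface name and methods are invented for this example and are not Plakar's actual SDK. The idea is that any source, Notion, IMAP, or another backup tool's output directory, only has to know how to enumerate items and stream their contents.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Connector is a hypothetical shape for a backup integration: anything
// that can list items and open them for reading can be ingested.
type Connector interface {
	Name() string
	List() ([]string, error)               // paths/IDs of items to back up
	Open(id string) (io.ReadCloser, error) // stream one item's contents
}

// memConnector is a toy in-memory source standing in for a real service.
type memConnector struct{ items map[string]string }

func (m memConnector) Name() string { return "demo" }

func (m memConnector) List() ([]string, error) {
	ids := make([]string, 0, len(m.items))
	for id := range m.items {
		ids = append(ids, id)
	}
	return ids, nil
}

func (m memConnector) Open(id string) (io.ReadCloser, error) {
	return io.NopCloser(strings.NewReader(m.items[id])), nil
}

func main() {
	var c Connector = memConnector{items: map[string]string{"note-1": "hello"}}
	ids, _ := c.List()
	// The engine consuming the connector never cares what the source is;
	// that indirection is what lets one tool absorb many systems.
	for _, id := range ids {
		r, _ := c.Open(id)
		data, _ := io.ReadAll(r)
		r.Close()
		fmt.Printf("backed up %s from %s: %d bytes\n", id, c.Name(), len(data))
	}
}
```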

Segment 14 (65:00 - 66:00)

I didn't see a Git repo option for GitHub. Though maybe, instead of doing it with Git, because if you do it with Git you could cover any of the Git providers, it might be better to do it through the GitHub API. It depends what you want, because on GitHub what you'd want is probably not just the code; it would be all the issues and everything else. Right, there are two levels there: I need the code, but I could also really use all that other stuff. Yeah.

Well, this has been great. We could talk forever, and I really appreciate your time; you've both been very generous with it. We covered lots of topics in this hour, and I'm excited to start playing with it. I'm excited to hear what you're going to announce on the Kubernetes side; that's where I live. So for the Docker and Kubernetes stuff, I'm going to subscribe to any issues that have those words in them so I can keep track of the status. How do people find you? We've got the website, we're on Discord, there's the GitHub repo, and the socials. So for everybody who wants to get involved: we work on Discord. As I said, we're all working remotely, and we work transparently on Discord, so you can just come to our Discord. You can actually attend all of our meetings. You'll be muted, but you can look into any technical discussion that happens in the open. Except the daily, where you can come and talk with us. Yeah. All right, so we know what's next. We know you're going to be at KubeCon. People can follow you individually; I guess you're on socials, on LinkedIn. In the YouTube description, all the links are below for how to follow these two fine gentlemen. Well, thank you for having us; this was pretty great. Thank you very much. All right, thank you both for being here, and we'll see you next time here on DevOps and Docker Talk. Ciao, everybody!
