2025 Recap: so many projects


Table of contents (7 segments)

Segment 1 (00:00 - 05:00)

I've been working on so many projects in 2025. I thought it was important for me to make a recap, if only just to clear my head. There are many, many things to go through and we don't have a sponsor today, so I'm just gonna start right away with Facet. Facet is a project that I started working on in March of this year. That's right. It's only been 10 months, yet it feels like an eternity. I think the initial driving force for Facet for me was: I'm sick and tired of waiting for things in the Serde Cinematic Universe to compile. I've spent many, many moons looking for opportunities to make builds faster. I've written custom tooling for it. I am in the top results every time that I search for CI optimization, which is not where you wanna be. Do not gaze into the abyss, lest the abyss start texting you at 4:00 AM asking if "u up". Basically the idea is that serde is highly generic, right? Deriving the Serialize or Deserialize trait for a Rust type already generates a bunch of code, but you don't really pay for it yet because it's generic code. It's only when you actually call serde_json::from_str with your type that the generic code gets instantiated, and then rustc and LLVM do their best to optimize all this, and that can take a long time. My first attempt at tackling this, merde, had serde-like traits, but mine were dyn-compatible so that you could do dynamic dispatch instead of monomorphizing everything. That's what you call instantiating generic types with concrete types, and that's why it can generate a lot of code. Makes builds real slow. However, reading this today, I realized I just made a shittier version of erased_serde, so that was a waste of time. Facet is my second attempt, and it's based on the realization that it's much nicer to implement serialization on top of reflection than the other way around.
Instead of having these visitor patterns that drive serialization in serde, you get access to an associated const called SHAPE for every type that implements Facet. In that shape, you have information about what kind of type it is. Is it an enum? Is it a struct? You have information about which traits are implemented and how to call different methods. You also have VTables for lists, maps, and sets, along with information about fields, their offsets, and their own shape. From there, things kind of snowballed. It's like, okay, we have the information. We must have a nice API to read from existing values — I call that Peek — and that gives us serialization. But we also must be able to build values from scratch using reflection. And that one's super tricky, because you are dealing with a partially initialized object. You're dealing with states like: if it's an enum, have you selected a variant yet? Have you initialized some of the fields of the variant payload? What happens to those fields if you switch to a different variant now? It's a minefield of undefined behavior, potential memory corruption, et cetera. I would never have embarked on this journey if it weren't for Miri, which catches a lot of undefined behavior and some defined behavior as well. Good news is, once you've written that unsafe code, you can write so much stuff on top of it. Now, deriving Facet on a struct is just a bunch of statics and maybe a bunch of trampolines or adapter functions for the VTables. And each format crate, like facet-json, facet-yaml, facet-postcard, is just one crate that works with every type that implements Facet. There's no generics involved. It's all just reading the shape of types at runtime and acting accordingly. Now, I knew this was gonna have a cost in terms of runtime performance, obviously, but I didn't know exactly how much. Initial tests showed that facet-json was between five to seven times slower than serde_json. And I was reluctantly okay with that.
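The shape-based approach can be sketched with a toy example. Everything below (`Reflect`, `Shape`, `Def`, `FieldInfo`, `field_names`) is invented for illustration — facet's real API is richer and differs in the details — but it shows the core idea: a static description of the type that format crates can walk at runtime instead of monomorphizing per type.

```rust
// Toy sketch of reflection-driven serialization in the spirit of facet.
// All names here are invented for illustration; facet's real API differs.

pub struct FieldInfo {
    pub name: &'static str,
}

pub enum Def {
    Struct { fields: &'static [FieldInfo] },
    Enum { variants: &'static [&'static str] },
}

pub struct Shape {
    pub type_name: &'static str,
    pub def: Def,
}

pub trait Reflect {
    const SHAPE: &'static Shape;
}

#[allow(dead_code)]
pub struct Point {
    x: f64,
    y: f64,
}

// What a derive macro would emit: a static description of the type.
impl Reflect for Point {
    const SHAPE: &'static Shape = &Shape {
        type_name: "Point",
        def: Def::Struct {
            fields: &[FieldInfo { name: "x" }, FieldInfo { name: "y" }],
        },
    };
}

// A "format crate" can walk the shape at runtime instead of
// instantiating generic serialization logic for every type.
pub fn field_names<T: Reflect>() -> Vec<&'static str> {
    match T::SHAPE.def {
        Def::Struct { fields } => fields.iter().map(|f| f.name).collect(),
        Def::Enum { variants } => variants.to_vec(),
    }
}
```

The point of the sketch: only `field_names` is generic, and it never gets specialized into per-type serialization code — all the type-specific information lives in one static.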
Like, it's still the same order of magnitude, and I was still hopeful that at least you'd be faster to compile, you would win in build times, binary size, et cetera. And then I measured build times and I measured binary sizes, et cetera. And it turns out that not only was it slower at runtime, it was also slower to compile and bigger in binary size. So when I made the announcement video for Facet, I was like, well, things are not exactly what I would like them to be right now, but I have to say the truth. Like, you have to share the numbers, right? Otherwise, what are we doing here? Knowing the situation is the first step towards improving it, and that works for everything, but it kind of ate into my enthusiasm for a while there; I was just focused on other projects, I think. And in October of 2025, someone started porting their code base — a big proprietary code base — from serde to facet, and they encountered a million bugs, which I told them to report individually on the issue tracker. And so we had, I think, a couple weeks of back and forth, like, oh, there are these four new issues, and me fixing them as fast as I could. And one of the issues was build times. The build times got worse switching over to facet. Part of the reason is that facet generates a lot of code, and part of the reason is that it's really hard to completely switch away from serde and syn and other crates like that, because they're so prevalent. You might still be paying for them somewhere else. Maybe tracing pulls them in. Maybe you have a derive macro somewhere. Maybe some crate uses serde_json internally just to have a value type. So now facet, which isn't free, is on top of the ecosystem you're trying to move away from, and the build time increases. And it's during those two weeks that I decided, you know what? We're not gonna try to be the smallest, fastest-to-compile, fastest-runtime crate.
We're just gonna try to be the nicest in terms of developer experience, which we call DX for short. I started focusing on adding features that would be nice to have, like best-in-class error reporting for parsing errors using miette, streaming deserialization from AsyncRead, etc. I started working on facet-solver, which gives you the best error messages for untagged enums and flattened structs and everything — things that are really hard to resolve, like, you need a complete view of the shape of the types you're trying to deserialize, and you need to know everything that happened up to this point. And that's exactly what facet-solver does, and it does it for all the format crates, not just for JSON, because we have so many format crates. We have JSON, YAML, TOML, Postcard, MsgPack, XML, KDL, but also now SVG, HTML, CSV, XDR, query strings, command-line arguments, ASN.1. Features tended to drift between crates: you implement something about untagged enums in JSON, and suddenly YAML doesn't support it, so you have to fix all crates individually, which is aggravating. To combat this, I introduced facet-format, which is the successor of facet-

Segment 2 (05:00 - 10:00)

serialize and facet-deserialize, and it's the basis of all format crates today. That meant rewriting all the format crates again, then renaming the old crates to legacy and the new crates back to the old names, and then deleting a hundred thousand lines of code in one PR, which I posted on social media — because, like, how often do you get to do that? And there's probably a hundred regressions I haven't found yet, despite careful planning and execution. But that's a problem for, say it with me now, next year. And in the middle of adding a lot of features to facet to make it the nicest thing ever, I decided, you know what? No. It was supposed to be lighter. It's bullshit that it generates more code than serde and is actually slower to compile. I don't like that very much at all. So I started working on reducing the amount of code generated. There's a handful of tools you can use to do that. One of them is cargo-llvm-lines. But you can also use the -Z macro-stats unstable rustc flag. And of course you could do rustc self-profiling. I've blogged about this before, because of course I have. And using those, I was able to get to a point where a bloat benchmark using generated struct and enum types was actually faster to compile with facet than with serde. That was a couple months ago, but I checked just before writing this, and apart from a couple dependencies that slipped in by mistake, things are pretty even still. More importantly, there is now tooling to compare facet against its past self in terms of lines of LLVM intermediate representation generated, compile times, binary size, et cetera. I made a little text user interface, a TUI, to look at all the vectors which are tracked in the repository. The measurements are done on a synthetic code base, but I've gotten reports that it does translate to similar results on real-world code bases. It is a moving target, though, as things evolve and we add more features.
Another thing I said in my trying-to-be-optimistic initial announcement of facet was like, well, okay, it's slower at runtime, but maybe we can do JIT — just-in-time compilation — and then it might even be faster. And for a long time I kept using that as kind of a shield or a comfort excuse. Like, yeah, it's slow, but that's because we haven't tried doing the real thing yet. And I felt like it was dishonest and I was tired of using it. And I just decided to make it happen with Cranelift and see if we actually can get faster. At first I made what I'm calling tier one JIT, which all formats can benefit from. Instead of assigning fields via reflection, you just generate code that does it directly for you, and you get some decent performance benefits just from doing that. But it gets even better if you go to tier two JIT, which is: the format crate, like facet-json, knows how to parse JSON, therefore it knows how to emit instructions that will parse JSON. And then you have facet-format, which knows how to emit instructions to construct types and assign their fields and pick enum variants and whatnot, and drop them if needed. And so you combine the two, and that gives you a lot more performance. There are caveats, of course. We don't get the auto-vectorization that LLVM can do, like automatic usage of SIMD instructions. We don't get to inline calls into the standard library. That one sucks. So we have to be smart and do things like using staging buffers and then do Vec::from_raw_parts. For hash maps, we can build a slice of key-value pairs and then build a map in one shot with from_iter.
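The hash-map trick mentioned at the end can be sketched in plain Rust. `build_map` is a hypothetical helper written for this recap, not code from facet's JIT; it just shows why staging pairs and collecting once beats emitting a call per insert:

```rust
use std::collections::HashMap;

// Sketch of the staging trick: instead of emitting one (non-inlinable)
// insert call per entry from JIT-generated code, stage key/value pairs
// in a flat Vec and build the map in one shot.
// `build_map` is a hypothetical helper, not actual facet code.
pub fn build_map(staged: Vec<(String, u64)>) -> HashMap<String, u64> {
    // collect() goes through FromIterator: the map can size itself once
    // from the size hint and do all insertions in a single tight loop.
    staged.into_iter().collect()
}
```

The JIT-generated code only needs to append to the staging `Vec`; the single `collect` call at the end is a normal, LLVM-compiled function.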
There's a ton of little tricks that went into this, but as of today, I'm happy to report that if you're okay with somehow depending on Cranelift at runtime, while having code generated that is nigh impossible to debug, might contain undefined behavior, and might crash your program and whatnot, you can now beat serde while staying in the facet ecosystem, at least for JSON and Postcard. Postcard is twice as fast. I'm so happy about this. And I know because I made a performance dashboard that tracks facet-json versus serde_json, using divan for benchmarking times and gungraun for benchmarking instructions. Everyone who has reviewed the script has asked me, like, is that a typo? It's not. It's just a thing based on Valgrind, I guess. I guess they're both Swiss German or something. So for both JSON and Postcard, tier two JIT is usually faster than serde. Now, should you use it? Like I said, there's a bunch of caveats. For me, I don't mind. I think it's funny. I think most programs would be like, we can take the performance hit of reflection, or, we'll just stay using serde. Or — a thing I haven't explored fully yet, but I know would work — doing codegen from facet information: essentially, you move all your types into a crate, and then that types crate becomes a build dependency. And then you can run a build script that reads the type information using facet, in the associated const SHAPE, and generates code from there. I actually do that for some other project I'm gonna talk about. In the meantime, I'm working on lots of Rust crates, different formats. I noticed in my rustdocs that blocks of KDL or TOML or whatever are not highlighted on docs.rs, and that makes me very sad. Meanwhile, I'm also working on some private proprietary projects that I'm not gonna talk about, and I need to do syntax highlighting using tree-sitter, which is great, but I always have to go chase grammars that work, and I always run into problems compiling them to WASM with a Rust toolchain.
There's a bunch of linker hacks that are required. I'm trying out buck2 to build everything, which doesn't help things at all. It's a nightmare. So I decided to start working on the definitive Rust distribution of tree-sitter and tree-sitter grammars, called Arborium, because it's a collection of trees. So I go and find 96 grammars. I figure out a good API and make sure that they all have syntax highlighting queries, and I bundle a bunch of themes and I make a nice landing page for it. I make sure that they're able to compile to WebAssembly by faking all the C functions they claim they need. And I spend forever automating CI so that I can actually release updates to the grammars and the themes and the crates. I make sure that license information and attribution are still there. I get in touch with the crates.io team saying, I'm sorry, I'm gonna be publishing a hundred crates tomorrow. Is that okay? And they're like, yeah, you're already on the list of people who can

Segment 3 (10:00 - 15:00)

do that stuff, which means somehow you've done weird things before. And ta-da: arborium. I'm super, super happy about this release. It is already useful to a bunch of people, and I hope it becomes useful to more people in the future. But for now, it's just so satisfying to comprehensively, definitively solve a problem and just never have to think about it again. I say that knowing that still, in the back of my head somewhere, there's a voice that says, wouldn't it be nice to just have like a pure Rust version of tree-sitter, instead of having generated parsers in C, and a C core, and everything? And it might — it would also be slower to compile, and it would be a lot, a lot of work, even for me. It might be reasonable if you're willing to give up some performance and give up all the incrementality of tree-sitter, then maybe. But I don't need to do that, so I'm not gonna. Meanwhile, I'm working on facet. Surprise! I decide we need a proper website with proper documentation on there. And what are the options for static websites? Zola. Everybody loves Zola. It's Rust. It's our Emile Zola to their Victor Hugo. Well, I have very, very strong opinions when it comes to making websites, both the experience of making the website and what the result should look like. So you become aggravated with little things and little paper cuts, little developer experience problems. And the fact that it's harder than it should be to just make a plugin, right? I'm just looking around for an SSG — a static site generator — that will just let me make plugins, and in Rust, that does not exist. Not really. In other languages, sure: in JavaScript, just fucking `eval` it straight into my veins; in Ruby, require it; in Python, good luck with all those paths. But in Rust, nope... And I'm reminded that even if I somehow forked Zola and added everything that I wanted, it still would be pretty average at caching, just like pretty much every static site generator.
It would cache things that are stale by accident, and it would fail to cache things that it should. And that just makes me very sad. So I decided to make my own, named dodeca, after the dodecahedron, a nice shape. And, you know, how hard could it be? It's just turning markdown into HTML. Well, that part is easy. You get pulldown-cmark. You want syntax highlighting? I just made arborium, super. I want minification for HTML, JavaScript, and CSS built in. There are crates for all that. I want cache busting, of course, so now I need to rewrite HTML so link, script, and image tags point to cache-busted URLs. We have crates for that. I want image processing built in: PNGs go in, JPEG-XL, AVIF, and WebP are served to browsers. You better believe that we've got either pure Rust implementations or wrappers for the original C/C++ implementations. And at some point I'm 1200 dependencies in and iterating becomes really painful. I'm reminded of the time that I made Rubicon to enable dynamic linking, even if you're using crates with thread-local storage, like tokio, tracing, et cetera. But I do not want to have anything to do with dynamic linking anymore. So what's the next thing? It's not PIC, it's IPC. Interprocess communication is just RPC at home, so I named my thing "rapace", which is RPC with extra letters, and it's just French for bird of prey. And I have, again, strong opinions about how to do RPC. I've been doing RPC for a while. There are things that I like and things that I don't like. For example, gRPC: do not like, mostly because of protobufs, which have all the downsides of Go with none of the charm. And I figured I have all those different patterns that I want, right? First off, to make iteration easier, I'm gonna have the central app, the hub, be its own binary, and everything around it is a cell: you put HTML minification in one cell, you put image compression in one cell, even HTTP serving in one cell, the text user interface in one cell. Everything is a cell.
I have 18 cells right now in dodeca. And yeah, I wanna do this over shared memory, because I'm giving up on dynamic linking, but I'm not giving up on performance, you know? So if you have to compress a large image, it's nice if you don't actually have to make several copies of it over the RPC system. To do this right, you have to set up your shared memory as kind of an allocator, a buffer pool. You have to keep track of which buffer is owned by whom. When sending the uncompressed image payload between two IPC peers, you can do that in a zero-copy fashion if you treat it as a reference or handle to the allocated memory, and if you use a special memory allocator which takes from the shared memory area. Now, I'm not exactly there yet, but I have eliminated quite a few copies, and I'm doing zero-copy deserialization, where you get the frame back and then you deserialize borrowing from the frame, and you just carry the frame and the deserialized payload together — like you would do with the yoke crate, but without using the yoke crate, because it relies on syn. Obviously, rapace uses facet-postcard for serialization and deserialization, and that makes it super easy to discover services dynamically and use them, call their endpoints, without even knowing about them at compile time. So you can make dashboards to explore services. And there is one in the example binary, by the way. So I did this whole complicated shared memory design. You know, once you have RPC semantics, it's tempting to try other transports. Why shouldn't I use that to have the dodeca dev tools talk to the dodeca dev server to get things like, you know, hot module replacement — except instead of modules, it's paragraphs of markdown getting rendered to HTML? Why shouldn't I use the same thing for different services in my Kubernetes cluster to talk to each other?
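The ownership-tracking idea behind the buffer pool can be sketched with an in-process toy. Real shared memory, frames, and the rapace types are all elided here; `Pool` and `Handle` are invented names. The point is that peers exchange a small handle, never the payload:

```rust
// Toy sketch of a buffer pool handing out handles instead of copying
// payloads, in the spirit of the shared-memory design described above.
// In-process Vecs stand in for the actual shared memory region.

pub struct Pool {
    buffers: Vec<Vec<u8>>,
    free: Vec<usize>, // indices of buffers not currently owned by anyone
}

#[derive(Clone, Copy, PartialEq, Debug)]
pub struct Handle(usize);

impl Pool {
    pub fn new(count: usize, size: usize) -> Self {
        Pool {
            buffers: vec![vec![0u8; size]; count],
            free: (0..count).collect(),
        }
    }

    // Claim ownership of a buffer; None if the pool is exhausted.
    pub fn acquire(&mut self) -> Option<Handle> {
        self.free.pop().map(Handle)
    }

    // Peers exchange the Handle (a small integer), not the bytes.
    pub fn get_mut(&mut self, h: Handle) -> &mut [u8] {
        &mut self.buffers[h.0]
    }

    // Return ownership so another peer can reuse the buffer.
    pub fn release(&mut self, h: Handle) {
        self.free.push(h.0);
    }
}
```

A real implementation has to make `acquire`/`release` safe across processes (atomics in the shared region), which is exactly the bookkeeping the transcript calls "keeping track of which buffer is owned by whom".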
And so next thing you know, you have a shared memory transport, but also a WebSocket transport, a generic stream transport, an in-memory transport for testing. Of course. Again, there was kind of a rapid growth era where I just added rapace to everything I could think of, and it worked, more or less. And then I started thinking about implementations for other languages, and I figured, okay, I need a proper specification. Enough with just winging it; that's not gonna work in the long term. And that reminded me of something that James, my podcast co-host on

Segment 4 (15:00 - 20:00)

Self-Directed Research, taught me about this year, which is traceability. Get out here, Amanda. Anyway, James. Okay. I have a topic that's a little out of left field — it comes from my background in safety-critical — but it is a thing that I wish we had more of, and that thing is traceability. And really, what I wish is that we had good open-source traceability tools. You wanna have a specification and you wanna have an implementation, or several, and links between the two. You wanna have, like, every requirement in the specification carry a unique identifier — the Rust reference has that now — and then you annotate your code to say: this implements that requirement, and then you cross-reference it. You can go through the entire spec and see if everything is implemented in the code, and you can do the reverse: you can go through the code and see if all the code is covered by requirements. And you know what? This is very much in the spirit of my year 2025. I remember James saying there's no great tooling for it in Rust, and I didn't even check. I just went and made my own immediately, which I called tracey — with an E — mostly so that people who are looking for the Tracy profiler, without an E, get confused and use my software instead. As I'm writing this, because of all the changes I made to the rapace specification, I do not have a hundred percent coverage or reverse coverage. But it's pretty nice to see the tracey interactive dashboard and be able to see where all three implementations stand. In fact, I have added support for the tracey requirements syntax to dodeca, so that you can embed the specification on your website and just have it be clickable and refer to it, which means you can have IDE tooling that refers to a spec requirement and links directly to your website. Speaking of dodeca, another very important part of it is the query system. I've been obsessed with salsa ever since I learned about it.
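At its core, the cross-referencing described above is set arithmetic over requirement IDs. A toy sketch — this is not tracey's actual syntax or API, and the `REQ-*` identifiers are made up — showing both directions of the check:

```rust
use std::collections::BTreeSet;

// Toy version of traceability cross-referencing: given the requirement
// IDs declared in a spec and the IDs annotated in code, report gaps in
// both directions. IDs and function names are invented for illustration.

// Spec requirements with no code annotation pointing at them.
pub fn uncovered<'a>(spec: &[&'a str], annotated: &[&'a str]) -> Vec<&'a str> {
    let done: BTreeSet<_> = annotated.iter().copied().collect();
    spec.iter().copied().filter(|r| !done.contains(r)).collect()
}

// The reverse: code annotations that point at no known requirement
// (e.g. a requirement that was renamed or deleted in the spec).
pub fn dangling<'a>(spec: &[&'a str], annotated: &[&'a str]) -> Vec<&'a str> {
    let known: BTreeSet<_> = spec.iter().copied().collect();
    annotated.iter().copied().filter(|r| !known.contains(r)).collect()
}
```

The hard part of a real tool is everything around this: parsing the spec and source for identifiers, and presenting the result — the dashboard the transcript mentions.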
Basically, when you write something like a compiler, you have a bunch of inputs, and queries which can read from those inputs or from other queries, which themselves can read from other queries — you get the idea. And the goal is super simple: be as lazy as possible. Do not recompute a query unless anything that goes into it has actually changed. And of course, only evaluate the queries that you actually need, that someone actually requested the result of. Salsa is used in rust-analyzer, and the laziness is a blessing and a curse. They had to implement pre-warming, because if you don't query anything, it's not doing anything. So the first time you ask for a completion, it's like, whoa, buddy, I have to analyze this entire code base. As of recently, salsa is also able to persist its database to disk, which can be, again, a blessing or a curse, because what if the database is huge and now loading it from disk is extremely costly as well? Because I wanted perfect caching in dodeca, I started using salsa, but I fairly quickly ran into the problem that most of my operations are asynchronous now. Even just compressing an image is making an async RPC call over shared memory to a different cell. Therefore, salsa doesn't work for me, because all the queries are supposed to be synchronous. And that is how I started working on picante, which is not a fork or anything; it's just the same ideas as salsa, but async-first. I had to make a bunch of different choices there. It uses facet for everything, of course, including equality comparison: even if your types don't implement PartialEq, it just does structural equality, which is nice. I did also implement persisting the database to disk, and even doing it incrementally, so you don't have a big save phase at the end. I'm sure there are still a lot of bugs in picante. It sounds too good to be true, but overall it's been giving me what I wanted: tracking absolutely everything in dodeca.
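The "don't recompute unless an input changed" idea can be sketched with a single-query toy. This is neither salsa's nor picante's API — just the revision-tracking principle: bump a revision when an input changes, and let cached query results remember which revision they were computed at.

```rust
// Minimal sketch of revision-based memoization in the spirit of
// salsa/picante. One input, one query; real systems track a dependency
// graph of many queries. All names here are invented for illustration.

pub struct Db {
    input: String,
    revision: u64,
    cached: Option<(u64, usize)>, // (revision at compute time, value)
    pub computations: u32,        // instrumentation for the example
}

impl Db {
    pub fn new(input: &str) -> Self {
        Db { input: input.into(), revision: 0, cached: None, computations: 0 }
    }

    pub fn set_input(&mut self, s: &str) {
        if s != self.input {
            self.input = s.into();
            self.revision += 1; // stale-ness is detected via this bump
        }
    }

    // The "query": length of the input, recomputed only when stale.
    pub fn len_query(&mut self) -> usize {
        if let Some((rev, v)) = self.cached {
            if rev == self.revision {
                return v; // cache hit: nothing changed since last compute
            }
        }
        self.computations += 1;
        let v = self.input.len();
        self.cached = Some((self.revision, v));
        v
    }
}
```

Note that `set_input` with an identical value does not bump the revision — that equality check is exactly where picante's reflection-based structural equality comes in, so the whole downstream query graph stays cached.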
There's very little difference between the production build of a website and the development build of a website. The main difference is that in development, we inject the script tag for dev tools. But for both profiles, we do minification of JavaScript, HTML, and CSS by default. Might wanna make a setting to disable that, but for now it's like that. We do, of course, image compression on the fly, depending on what you request, which is to say, depending on what your browser supports. And we do something that I've always dreamed of doing, which is codepoint-accurate font subsetting. So there's a query for rendering markdown pages to HTML, a query for extracting all the code points from the HTML and putting them into a set, a query for merging those sets together, and a query for taking the uncompressed font and the set of all the code points for a certain style and doing the font subsetting. And it's all lazy: it only recalculates when you use a character you've never used before. Like, suddenly you paste in something that has Unicode box-drawing characters, and yeah, it needs to add them from the original font. But Amos, isn't font subsetting expensive? Don't you need to shell out to Python to use pyftsubset? No, it's not, because I released woofwoof, which is just a build of the WOFF2 C++ implementation — but I packaged it nicely. I made sure that it builds in CI for Linux, Mac, and Windows. And then I released fontcull, which is a Rust version of what was, until that point, my favorite tool for that, called glyphhanger. The problem is that glyphhanger hadn't been updated in five years and was using a very, very old version of Playwright — old enough that I couldn't download browsers anymore. I just decided, fuck it. Let's do all of it in Rust.
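The first steps of that subsetting pipeline are just set operations over characters. A sketch under simplifying assumptions — HTML text extraction is elided (plain text goes in), and the function names are made up:

```rust
use std::collections::BTreeSet;

// Sketch of the first two queries of the font-subsetting pipeline:
// collect the code points a page actually uses, then merge per-page
// sets into one set per font style. Names invented for illustration.

// Per-page query: every distinct code point in the rendered text.
pub fn used_codepoints(text: &str) -> BTreeSet<char> {
    text.chars().collect()
}

// Merge query: union of all per-page sets. Because it's a set union,
// pasting a character you've already used changes nothing downstream,
// which is what keeps the subsetting query cached.
pub fn merge(sets: &[BTreeSet<char>]) -> BTreeSet<char> {
    sets.iter().flat_map(|s| s.iter().copied()).collect()
}
```

The merged set is what gets handed, together with the uncompressed font, to the actual subsetter — only when the set has changed.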
We have the crates — or do we? Because I've been keeping an eye on this for a while, talking about weird hobbies, and for font subsetting, there was only something that worked for PDF, which doesn't need the full font information. So it couldn't be used to subset fonts and then use them in browsers. But I also knew that some people at Google were working on a bunch of Rust crates around fonts. And the good news is that this year they came close enough that you can use their crates to subset fonts and use them in browsers. And I know that because I vendored their code straight from Git. They're not released on crates.io yet, except as part of fontcull... I hope that's okay. I kept the license. And the result is that if you go on facet.rs, which does use dodeca for the docs, and you look in the Network tab and you filter by fonts, you will see that the Iosevka font being served is 10 kilobytes, down from the original two megabytes, because the original includes nerd fonts, et cetera. But the point is, if I paste a terminal session that has nerd fonts icons, it's just gonna add them magically. Everything is perfectly tracked and perfectly cached, and I'm perfectly happy. Everything's perfectly wonderful. Speaking of dodeca, since I knew I was gonna use it for specifications

Segment 5 (20:00 - 25:00)

and technical documentation, I wanted to have a way to make diagrams, but I wanted server-side rendering. I didn't want to use Mermaid.js or something, because client-side rendering is bad for page load time, it's bad for accessibility sometimes, it's bad for layout shift. I just hate it. I don't want it. So I looked around for alternatives. I found D2, which is made in Go. Uh, but it's made in Go. And I found Typst, which I bundled for a while to make Open Graph previews for pages automatically, but dear Lord, it was my heaviest dependency by far. I just got rid of it. Eventually, someone pointed me to pikchr, a diagramming solution that I didn't know, and that had a completely self-contained C implementation — a very good candidate for a Rust port. Thing is, I didn't really feel like porting it myself. So I essentially set off Claude to port it by giving it tools to compare a hundred test cases. I had it generate an HTML comparison page, so you could see every rendering side by side, overlaid, onion-skinned, everything. And that's for me; that's for humans. And then for it, I made it make an MCP that runs a single test, renders the two SVGs to PNGs, and attaches those as a response, because those models have vision capabilities. So sometimes it's able to look at the thing and go, oh, the lines are in the wrong place — whereas if it were to compare SVG, it would just be drowned in all the markup. Speaking of comparing SVG, the Rust implementation of pikchr, which I called pikru, also uses facet-svg, which is just a bunch of types defined on top of facet-xml, which I made just for this project. And I worked on a bunch of tree-diffing algorithms just to produce diffs good enough that the agent was able to tell, oh, this is what's wrong with the rendering. Eventually, the Rust port reached a hundred percent parity with the C implementation. So I published everything on GitHub Pages with the comparison HTML and everything. Made it look good.
And it's at that point that I discovered that actually Claude and GPT and any other AI that I tried sucks at writing pikchr diagrams. So unless I wanna do the diagrams myself, which I don't really want to, because there's not a lot of auto-layout going on in pikchr — it's not like, give Rust struct, produce diagram. Oh my God, we could use facet for this. Somebody stop me. I decided to have a go at another diagramming solution. I didn't know that something like svgbob existed and already had a Rust crate. Instead, my research pointed to aasvg, which is based on a client-side markdown implementation called markdeep. My port is called aasvg-rs, unimaginatively, and it also has parity with the original. It's doing that nice thing that I did in both my ports, where it's using CSS variables to get light-dark support in SVG. Little caveat: this is absolutely not supported by Safari Mobile, which is the current Internet Explorer. More things happened in the facet ecosystem that I haven't really talked about. As I was writing this video, I was also working on the facet benchmarks, and I was like, oh, the JavaScript code for the benchmark browser view keeps getting out of sync with the format generated by the benchmark harness. If only you could use TypeScript types to validate the front end, and if only you could use JSON Schema to make sure the backend is producing what we think it is, and if only the source of truth was just a bunch of Rust types — wouldn't that be fantastic? Well, obviously there are existing solutions. There are derive macros like schemars, but at some point I promised that facet would be the last derive macro you'd need, and in that case, it works. I quickly threw together facet-typescript and facet-json-schema, which made iterating on the benchmark dashboard a lot easier. Other things I was interested in included an alternative to thiserror, but based on facet, or something like displaydoc, or derives for the miette crate.
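The single-source-of-truth idea behind facet-typescript boils down to walking the reflected fields of a type and printing another language's syntax. A toy sketch with a hardcoded field list standing in for what facet would read from SHAPE; `ts_interface` is an invented name, not facet-typescript's API:

```rust
// Sketch of the facet-typescript idea: Rust types are the source of
// truth, and TypeScript declarations are generated from the reflected
// field list. The field list is hardcoded here; in facet it would come
// from the type's SHAPE. Names are invented for illustration.
pub fn ts_interface(name: &str, fields: &[(&str, &str)]) -> String {
    let body: String = fields
        .iter()
        .map(|(field, ty)| format!("  {}: {};\n", field, ty))
        .collect();
    format!("interface {} {{\n{}}}", name, body)
}
```

Run over every type in the benchmark protocol, something like this keeps the frontend's TypeScript types mechanically in sync with whatever the harness emits.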
So you can implement Diagnostic without either doing a manual implementation or using something that depends, again, on syn. And the problem with all of these is that you actually have to act like a macro. You have to generate additional code. It's not just like, oh, you derived `Facet`, so you can do whatever you want at runtime. For something like thiserror, you have to implement Error and Display; you have to implement Diagnostic; therefore, you have to generate code. Therefore, we needed to come up with some sort of plugin system for facet — something that ideally would reuse the result of parsing your type definitions, which the facet macros already do, but that is able to use templates to generate different implementations. And we did exactly that. It is not entirely final, and the templates are pretty simple for now, but it works: you don't need another derive macro, you don't need syn. I have no idea what the performance is like — I haven't actually measured build times on any of this — but the idea is pretty simple, and we shouldn't need too many of these, because there's a finite amount of traits that you really want to implement. For example, I barely bother implementing Debug anymore, because if I wanna see what's inside a type, I just use facet-pretty. Every trait where performance is not paramount, and that can be reimplemented using just reflection, is savings in terms of code, build times, final binary size, et cetera. One very cool application of rapace that I made recently is FS Kitty, which has to do with virtual file systems. The SSD that's in your laptop has a real file system on it. Maybe it's ext4, maybe it's Btrfs, maybe it's ZFS if you're the good kind of nerd, or, you know, APFS or NTFS or whatever. Sometimes you wanna access files over the network, and then you would do something like Samba or NFS, and sometimes you just wanna kind of make up files — like pretend you mounted a ZIP file, for example.
And for that you need a VFS: a virtual file system. On Linux, if you want to make a virtual file system, you can use FUSE, which stands for Filesystem in Userspace. On macOS you could write kernel extensions, but of course, anything that runs in the kernel must never crash, otherwise everything crashes, because monolithic kernels won the war. Therefore, Apple has been trying to kill kernel extensions for as long as they've been a thing, and they've introduced things piecemeal to let companies like Dropbox have their virtual file systems without touching the kernel as much. The last piece is FSKit, which lets you implement a file system entirely in user space, communicating with the kernel over XPC, another form of RPC. Which is great, except you have to package it up as a file system extension, as an .appex bundle, which registers in System Settings when you open the associated regular app bundle. It's not really designed for command-line tools, but I saw some people

Segment 6 (25:00 - 30:00)

made something called FSKitBridge, which adds another layer of RPC: you implement your file system in any language, and their file system extension connects to your binary over TCP. It's a little layer cake of RPC that just works. You don't have to worry about the terrible Apple requirements, you don't have to pay them a hundred bucks a year, you don't have to ship or sign an app yourself. But I just didn't trust these guys to make a file system extension and install it on my system. Even though it's entirely in user space, I don't know, it felt wrong. So I made my own, FS Kitty, but using rapace for RPC. With FSKit, you need an app, and most of the app and the extension are in Swift. Initially I just compiled a bit of Rust code and linked it in to get the rapace support, using swift-bridge actually, which does support async. Eventually I thought: okay, I'm getting hangs, I'm getting crashes, Tokio plus Swift is dicey. Wouldn't it be easier to just implement everything in Swift? And that's when I started writing rapace implementations in other languages. Just like, you know, Rust on the frontend is fine, but sometimes I just want to do Svelte and TypeScript and be done with it. Wouldn't it be nice to have a native TypeScript implementation of the rapace protocol? As I'm writing this, FS Kitty is at a stage where it used to work at some point, and then I started reworking all the dependencies, so it's broken right now. But it's gonna work again, let me tell you, because I need it for the project that is probably the most exciting to me of 2025, and that's gonna carry on into 2026: vixen. Porting my entire monorepo from cargo to buck2 was an eye-opening experience. It's not just me: cargo is pretty bad at caching, at least right now.
Things are always being improved, but the fact of the matter is, it simply is not designed like a proper build system should be. And "proper build system" is a loaded term; I have a very specific idea of what it should be, so this is a personal take. On my monorepo, a cold build with cargo is 35 seconds; with buck2 it's 25 seconds. A no-op build is almost a second with cargo and 0.06 seconds with buck2. As for changing a single line in a function deep in my dependency tree: 21 seconds under cargo, 8.5 seconds under buck2. The numbers speak for themselves. However, it is a major pain in the ass to maintain BUCK files for all your crates, especially if you have like a hundred of them, like me, plus maintain fixups for some of your dependencies. If only there was a build tool that had the properties of buck2 with the ergonomics of cargo. If only you didn't have to use a separate tool to generate build files for your dependencies. If only it was designed from scratch to be friendly to the Rust ecosystem, but not only Rust, because you need to own C/C++ compilation. And while you're at it, why not also own things like JavaScript bundling, any sort of manipulation you want to do on assets, container image construction? Why not? And if you're doing all this, why not do it with continuous integration in mind? Of course, because that's where caching matters most. I look at workflows that take one and a half minutes for ten seconds' worth of compilation, and I become the Joker. I know, I know, I'm complaining; people have hour-long CI pipelines. I don't care. My pipeline should take seconds, because that's how much CPU time I know is necessary to prove that the change is valid. I have great plans for vixen. It is a hugely ambitious project. It requires hubris, which I have again, now that we found the right meds. It is absolutely not capable of building anything except the most trivial hello world right now. But I'm going for it.
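For a sense of the maintenance burden: with buck2's Rust rules, every crate needs a build file along these lines. This one is illustrative, written from memory of the buck2 prelude's `rust_library` rule, not taken from any real repo:

```python
# BUCK -- one of these per crate, kept in sync with Cargo.toml by hand
rust_library(
    name = "my-crate",
    srcs = glob(["src/**/*.rs"]),
    edition = "2021",
    deps = [
        # third-party deps need their own generated build files,
        # typically produced by a separate tool, plus manual fixups
        "//third-party:serde",
    ],
)
```

Multiply that by a hundred crates, each drifting out of sync with its Cargo.toml, and the appeal of a tool with cargo's ergonomics becomes obvious.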
I don't know what else to tell you. I genuinely believe it's possible to get the best of both worlds. The most likely outcome is that I just burn out and never make anything useful with it, but dammit, I'm gonna keep trying, because I've been doing hacks around CI build times for a long time, like a decade easily. I have a CLI tool called "timelord" that saves and restores timestamps to try to make cargo stop rebuilding things, and it doesn't work anymore because of nanosecond-resolution problems on file systems. I've reached the point where I'm not even gonna bother anymore; I'm just gonna make my own build system. I don't need anyone else telling me that it's stupid and that I'm never gonna make it. I fulfill that job myself. I need people to get excited and make their own. Like, why should I be the only one trying? Also, I have no illusion of, and no intention of, replacing cargo. Cargo is gonna be there for as long as Rust is. It has the constraint of having to work for absolutely everyone on absolutely every platform, with strong backwards-compatibility guarantees. I get it. This is why it's exciting to be able to do a clean implementation: let's take all those ideas and properties that are nice and try to put them all together. It's a bit like cooking. I like it. I'm building it with remote execution and a content-addressable store from day one, which is why it's hard to get even the most trivial builds to run, 'cause there's so many moving parts. But that means you're suddenly not worried about target directories. Like, if you rebuild with or without a cargo feature enabled, is it going to overwrite part of the target directory and cause rebuilds? And if your CI is the same architecture as your local development machine, then by the time your changes reach CI, it's a no-op, because you've already done it. In fact, you can just run the entire CI pipeline locally, as your local vixen command line dispatches tasks to remote executors.
You don't need several different jobs defined in YAML. You don't need to temporarily upload artifacts to an artifact store and then download them in the next stage, because everything is just in the content-addressable store, and every executor has its own memory and disk cache of the CAS. You don't need to worry about dividing the build into several CI jobs so they run in parallel, because it's all part of the same build graph, and the orchestrator will build as much as possible in parallel. Of course, for that to work, you need your build to be hermetic for real. You can't just grab whatever toolchain is in the environment or happens to be on the file system. For example, for C/C++ builds, I'm grabbing toolchains from Zig. For Rust, I'm downloading toolchains directly from static.rust-lang.org.

Segment 7 (30:00 - 33:00)

I'm not going through rustup at all. For example, with cargo, if you build on the rustup stable channel and then on the 1.92 channel, even though they're exactly the same toolchain right now, it's going to rebuild, because the path changed. That's not a thing if you have true hermeticity: the toolchain is mounted somewhere, accessible through the virtual file system, and if it has the same hash, then, since the hash is part of the inputs, the inputs didn't change and there's no need to rebuild. The content-addressable store is not just a cute gimmick, either. Say you want to keep the last 16 Rust toolchains around: there's a lot of deduplication you can do there. They don't rewrite the standard library every time, and while there are LLVM tools that change every release, a lot can be reused. And if you're using ZFS, it's already doing that for you, and I'm really happy for you. As for build scripts: with buck2 you pretty much have to either patch them out, because they're doing something naughty, or run them if they're fine, like if they don't reach for the network or something. But you have to make sure their inputs are in there; you have to manually declare them. There are a lot of things in buck2 that you have to explicitly specify, and I don't like that. Yes, build actions should be hermetic, et cetera, and you should know exactly their inputs and outputs, but sometimes you can just infer that, right? Looking at a build script: if it's only calling the cc crate, I'm gonna patch the cc crate. I don't care. I'm gonna substitute a version of the cc crate that doesn't actually build C and C++ files directly, but instead creates actions to be dispatched by the orchestrator later. That just makes sense to me. There are so many patterns in Rust crates that we recognize, and that a build system could know about if it cared to look.
And there's always gonna be the odd crate out, like sqlx or rustls (yes, that's how you say rustls), that needs special treatment, like network access or compiling assembly. And instead of relying on someone maintaining a GitHub repo of all the fixups, maybe there's just a package manager built into the freaking build system. Maybe you can just make life comfortable for yourself. What a novel idea, what a concept. And of course, having dependency tracking that's rigorous to the point where you get perfect caching is extremely useful, because then you get to see the build graph and debug exactly why something rebuilt, which is something buck2 does well with its "explain" command. I'm gonna steal all of that and, again, expand it beyond just the build, to CI or even deployment. Why not? With vixen, I'm definitely at the "everything looks like a nail" stage, but my hammer is fucking awesome. Conclusion for this recap: I had a ton of fun this year, and I'm gonna have even more next year. I'm at that stage where I'm dogfooding like there's no tomorrow. Every one of my crates uses another one of my crates, and it's absolutely great, because I get the developer experience that I want. Unless things break, and then it's on me to go fix them. But you know, one of my favorite things to say is: I hope it's my fault that this broke, because if it's my fault, I know I can go in and fix it, but if it's someone else's fault, who knows how long until they fix it. All the stuff I talk about here is open source under the bearcove GitHub org, except for facet, which is under its own organization, which means you're free to go play with it all. Don't expect much stability, except from facet. Generally, if something becomes usable, I'm going to start making noise about it: an official announcement on my blog, a video about it. If I haven't yet, there's a reason.
This video is kind of an exception, kind of a teaser: ooh, I'm working on cool stuff, none of it is ready. I'm also happy with what I did regarding videos; we had a bunch of them this year. I worked with two different video editors, Sekun and Vlad; thanks to them for helping me along this journey. There's more coming next year, since videos pay for themselves with corporate sponsorships and whatnot. Speaking of: thanks to AWS for a large donation towards the development of facet. Thanks to Depot for all the CI build minutes. Thanks to for the free credits. If you're a company that wants to help sponsor some of these development efforts, definitely do reach out; my email is on my website's about page. I'm looking forward to next year, which is not how I've felt every year for the past 10 years. So, you know, it's nice when things are good. I hope things are good for you too. Take care, and I'll see you next... no, see you very soon.

Other videos by this author: fasterthanlime
