The case for sans-io




Table of contents (5 segments)

Segment 1 (00:00 - 05:00)

This video is sponsored by Google Security. As is now tradition, this video is also available as an article on my blog. If you're a patron, €1 per month or above, you can go read it right now; if not, you can just keep watching the video for free and get access to the article six months from now, so you can look at the code listings, copy-paste things, and go at your own pace. Today we're going to talk about the case for sans-io.

The most popular option to decompress zip files from the Rust programming language is a crate simply named zip. At the time of this recording it has 48 million downloads. It's full-featured, supporting various compression methods, encryption, and it even supports writing zip files. However, that is not the crate that everyone uses to read zip files. Some applications benefit from using asynchronous I/O, especially if they decompress archives that they download from the network. Such is the case, for example, of the uv Python package manager, written in Rust. uv doesn't use the zip crate; it uses the async_zip crate, which is maintained by a single person and gets a lot less attention. This situation is fairly common in Rust: the same code gets written against synchronous interfaces and against async interfaces. This results in a split ecosystem, duplication of effort, and of course more bugs overall. And that's a shame, because there's a lot about dealing with the zip format that is completely non-trivial. It is an old, crufty format with a lot of edge cases. Even though there is an ISO standard for the zip format, and most of it is described in the freely available PKWARE APPNOTE, there are still a lot of surprises to be found when looking at zip files in the wild, like I did when I worked at itch.io.
The zip format predates the universal adoption of UTF-8. Do not tell me that Windows still uses UTF-16; I'm trying to ignore that fact right now, and also they have a UTF-8 code page nowadays. So: the zip format predates UTF-8, and that means the encoding of file names in zip files used to be whatever code page your system happened to be set to. Only in the year 2007 was the APPNOTE updated to document extra field values that indicate that the file names and file comments are actually encoded with UTF-8. This was probably fine when you passed zip files on floppy disks from one office to the next in the same country, but at itch.io we had a situation where a Japanese game developer used the built-in Windows zip creation tool from Explorer and ended up with file names encoded as Shift JIS, a successor of JIS X 0201, a single-byte Japanese industrial standard text encoding developed in 1969, the year we went to the Moon. Most zip tools, however, treated that file as if it was encoded with code page 437, the character set of the original 1981 IBM personal computer (you know, the thing that gave the initials to "PC", as in PC and Mac). Which, to be fair, is a pretty good guess in the West if the UTF-8 flag bit isn't set, because the format only tells us whether a file name is UTF-8 or not UTF-8. The solution I came up with, so that the itch.io desktop app could install games from all over the world, is to take all textual content from the zip file (so, file names and comments) and do statistical analysis, trying to figure out what the character set is based on the frequency of certain byte sequences, like these for Shift JIS. This gives us a list of probabilities, and then you just take the highest and hope that you guessed right. I'm not aware of any other tool that bothers doing that. I think if I had to do it again, I would just require a standard archive format instead of trying to make sense of whatever stuff developers chose to shove in the file upload dialog of itch.io.
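A minimal sketch of that guessing idea, in the spirit of (but much simpler than) what the itch.io app did. The function name and the heuristic are made up for illustration; a real detector scores byte-sequence frequencies across several candidate encodings rather than just checking structural validity:

```rust
// Hypothetical sketch: classify a zip entry name as ASCII, UTF-8, or
// plausibly Shift JIS. NOT rc-zip's actual detector, just the same idea
// reduced to its simplest form.
fn guess_encoding(name: &[u8]) -> &'static str {
    if name.is_ascii() {
        return "ascii"; // CP437 and UTF-8 agree on this range anyway
    }
    if std::str::from_utf8(name).is_ok() {
        return "utf-8";
    }
    // Shift JIS lead bytes sit in 0x81..=0x9F or 0xE0..=0xEF, each followed
    // by a trail byte in 0x40..=0xFC (excluding 0x7F).
    let (mut i, mut leads, mut valid_pairs) = (0usize, 0usize, 0usize);
    while i < name.len() {
        let b = name[i];
        if (0x81..=0x9F).contains(&b) || (0xE0..=0xEF).contains(&b) {
            leads += 1;
            if let Some(&t) = name.get(i + 1) {
                if (0x40..=0xFC).contains(&t) && t != 0x7F {
                    valid_pairs += 1;
                    i += 2;
                    continue;
                }
            }
        }
        i += 1;
    }
    if leads > 0 && leads == valid_pairs {
        "shift-jis"
    } else {
        "unknown"
    }
}

fn main() {
    assert_eq!(guess_encoding(b"hello.txt"), "ascii");
    assert_eq!(guess_encoding("héllo.txt".as_bytes()), "utf-8");
    // "テスト" encoded as Shift JIS
    assert_eq!(guess_encoding(&[0x83, 0x65, 0x83, 0x58, 0x83, 0x67]), "shift-jis");
}
```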
That's not the only crufty part of the zip file format. For example, it doesn't really make a difference between files and directories: directories simply have length zero and their paths end with a forward slash. What about Windows? Well, first off, I don't know if you knew, but all Windows APIs support using forward slashes as a path separator. And secondly, this is one of the things the APPNOTE is very clear on. I'm going to quote here: "The path stored MUST NOT contain a drive or device letter, or a leading slash. All slashes MUST be forward slashes, as opposed to backwards slashes, for compatibility with Amiga and UNIX file systems, etc." Just to give you an idea of when the zip file format was designed: the Amiga was on their mind. Of course, if the zip was actually created on Unix, then the entry would have a mode, and from the mode bits you can tell whether it's a directory, a regular file, or a symbolic link. In the wild, I've noticed symbolic links tend to have their target as the contents of the entry, but of course that's not what the APPNOTE says. The APPNOTE says that in the Unix extra field there's a variable-size data field that can be used to store the target of a symbolic link or a hard link. Emphasis on "can be used", because there were so many different tools that could create zip archives, and standardization only came later, with the ISO standard (which mandates UTF-8 file names, by the way). The APPNOTE takes a descriptive rather than a prescriptive approach: it simply documents the various zip format implementations found in the wild.
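Those conventions can be sketched like so. The helper is hypothetical (not rc-zip's API); the constants are the standard Unix `S_IFMT` file-type values, which is what a Unix-made zip stores in the entry's external attributes:

```rust
// Classify a zip entry using the conventions described above: prefer the
// Unix mode bits when present, otherwise fall back to the trailing-slash
// convention. Hypothetical helper for illustration.
#[derive(Debug, PartialEq)]
enum EntryKind {
    Directory,
    Symlink,
    File,
}

fn entry_kind(path: &str, unix_mode: Option<u32>) -> EntryKind {
    const S_IFMT: u32 = 0o170000; // file-type mask
    const S_IFLNK: u32 = 0o120000; // symbolic link
    const S_IFDIR: u32 = 0o040000; // directory
    if let Some(mode) = unix_mode {
        match mode & S_IFMT {
            S_IFLNK => return EntryKind::Symlink,
            S_IFDIR => return EntryKind::Directory,
            _ => {}
        }
    }
    // No usable mode: directories are just entries whose path ends in '/'.
    if path.ends_with('/') {
        EntryKind::Directory
    } else {
        EntryKind::File
    }
}

fn main() {
    assert_eq!(entry_kind("assets/", None), EntryKind::Directory);
    assert_eq!(entry_kind("assets/logo.png", None), EntryKind::File);
    assert_eq!(entry_kind("latest", Some(0o120777)), EntryKind::Symlink);
}
```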

Segment 2 (05:00 - 10:00)

The APPNOTE documents those implementations without making value judgments about the choices made by different software authors, which makes it a terrible standard. So if you want to support most zip files out there, you have to be able to read DOS-style timestamps and Unix-style timestamps, which are completely different. DOS timestamps, for example, are completely bonkers. They fit in 32 bits: half for the time, half for the date. So far so good. The day is a 5-bit integer, the month is a 4-bit integer, the year is a 7-bit integer counting from 1980 (mind you), and as for the time, it is stored in hours, minutes, and 2-second intervals. I think of that every time someone says that IEEE 754 is weird because doing 0.1 plus 0.2 shows a lot of decimals after the 3, or whatever.

But okay, fine, those are details you can probably ignore for files that have been created with recent or sane tools (I don't like using that word). Even the most basic, fundamental aspects of the zip file format are slightly cursed, though. Most file formats start with a magic number, then a header including metadata, and then the actual body, the actual meat of the file: pixel data for an image, or vertex coordinates for a model, things like that. But not zip. The only correct way of reading a zip file is to start from the end of the file and walk back until you find the signature of the end of central directory record. That's a mouthful; get used to it. And that's why, if you take a look at the zip crate's API, it requires the input to implement both Read and Seek: even just to list the entries of a zip file, you need to be able to move around in it. Doing this properly is not as simple as it may sound. Originally, the zip crate made four-byte reads starting from almost the end of the file, then moved left by one byte every time it didn't match the signature of the end of central directory record, which was hugely wasteful. The async_zip crate, which was written later, improved on that by making reads of 2 KiB and moving to the left by 2 KiB minus the size of the signature, to handle the case where the signature would overlap two buffers, which is pretty smart. The comments mention a 500x speedup compared to the zip method. The zip crate eventually caught up in May of 2024 by doing 512-byte reads, which temporarily made it much faster, until August of 2024, when they fixed a bug in the end-of-central-directory-record finding logic. A pretty fun one, actually; let's dig into it.

Most file formats (reasonable file formats) have some sort of framing mechanism: you read the file moving forward, and you have records prefixed by their length. MP4, or rather MPEG-4 Part 14, calls those "boxes". Media authoring software tends to write a lot of metadata that media players don't necessarily know about, but anyone can skip over those boxes, even if they're of a completely unknown type. This property also makes it impossible to mistake data for the actual structure of the file. Each box has a type, and the type can be a valid UTF-8 byte sequence; there's never any ambiguity as to whether you're reading the type of a box or, say, the name of the author of the media file. However, in the zip format, because you're scanning from the end of the file going backwards, it is possible to read part of a comment or a file path and have it accidentally match the signature bytes of the end of central directory record. And that's the bug that was fixed in the zip crate in August of 2024: instead of stopping at the first thing that looked like an EOCD signature, they keep scanning the entire file all the way back, and keep track of all the offsets at which signature-like things were found.

And if you like finding bugs, you'd love getting paid for it. Once in a blue moon, the interests of a giant corporation align with the public's. Today is that moon: Google Security has launched a Patch Rewards program to incentivize proactive improvements to security in open source, with three main actions. One: harden existing C/C++ codebases; separate privileges, sandbox stuff, harden your memory allocator, and adopt patterns like the Safe Buffers programming model. Two: build the scaffolding required to integrate memory-safe components into existing C/C++ products; if Firefox, Chromium and Android have done it, you can do it. Three: take existing Rust crates with unsafe code and either reduce the amount of unsafe code or encapsulate it into self-contained safe abstractions. I am extremely excited about this program, because it's not about doing the bare minimum to patch a specific bug; it's about raising the bar for memory safety in critical pieces of software all at once, and getting paid to do it. Reward amounts range from $100 to $45,000. There's a three-contribution limit per month, patches must have been merged for at least a month with no reverts, and you can't submit anything more than a year old. Memory safety improvements have a 2x reward multiplier until the end of 2025, 3x if it's done to one of the tier-one projects. Scoped as core infrastructure: data parsers, which include a bunch of image, audio and video decoders, along with FFmpeg itself (will someone please harden FFmpeg). Other categories of tier-one projects include nginx, OpenSSL, and liburing, among others. Head over to g.co/prp to read the rules and submit your first patch. Thanks again to Google Security for sponsoring this video, and: security for everyone.

But of course, reading an entire multi-gigabyte file by increments of half a kilobyte, seeking backwards every time, is pretty much the worst possible read pattern.
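The candidate-collecting backward scan described above can be sketched over an in-memory slice (reading from a file in chunks, with the chunk overlap, adds bookkeeping but no new ideas). This is an illustration, not the zip crate's actual code; a real reader then validates each candidate (record size, comment length reaching the end of file) instead of trusting any single match:

```rust
// The end of central directory record starts with the bytes "PK\x05\x06".
const EOCD_SIG: [u8; 4] = [0x50, 0x4B, 0x05, 0x06];

// Collect every offset where the signature bytes appear, scanning backwards
// from the end, because a comment or file name may contain those same bytes.
fn eocd_candidates(data: &[u8]) -> Vec<usize> {
    let mut offsets = Vec::new();
    if data.len() < EOCD_SIG.len() {
        return offsets;
    }
    let mut i = data.len() - EOCD_SIG.len();
    loop {
        if data[i..i + 4] == EOCD_SIG {
            offsets.push(i);
        }
        if i == 0 {
            break;
        }
        i -= 1;
    }
    offsets
}

fn main() {
    // A fake archive: a decoy signature inside "comment" bytes, then a real
    // EOCD (a minimal EOCD record is 22 bytes: signature + 18 bytes).
    let mut data = b"garbage PK\x05\x06 more garbage".to_vec();
    let real = data.len();
    data.extend_from_slice(&EOCD_SIG);
    data.extend_from_slice(&[0u8; 18]);
    let found = eocd_candidates(&data);
    assert!(found.contains(&real));
    assert_eq!(found.len(), 2); // the decoy is found too; hence validation
}
```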

Segment 3 (10:00 - 15:00)

It's the worst possible read pattern you can inflict on any kind of device: any buffering done in userland or in the kernel is woefully unprepared for it. I was going to give you the example of a 4 GiB file, which would require 8 million syscalls just to find the EOCD, but then I stumbled upon this comment in the GitHub repository: "I tried this PR on a 200 GB zip file with 233,000 files". Wrong? Not technically, but maybe morally. By the way, if you're confused about all the complexity, like, what's the big whoop in finding this thing: remember that you can have garbage at the beginning of a zip file, or at the end of the zip file, or both, and most tools will still be able to decompress it just fine. Self-extracting zip files, for example, actually start with a native executable, and the zip file is just tacked on at the end. In December of 2024, as I was rewriting this piece, a PR landed after 11 weeks of back and forth that rewrites the EOCD detection algorithm yet again, fixing the huge performance regression introduced in August. Is the async_zip crate impacted by any of the bugs that were fixed and then re-fixed in the zip crate? Probably. It was last released in April of 2024, so who knows; I didn't check, because I have my own zip crate, rc-zip, which I believe to be the best of the three. Not just because it also does character set detection, like I explained before, but because, contrary to the zip crate or the async_zip crate, it is not tied to any particular style of I/O. Also, it has a very cool logo by the exceedingly talented Misha.

There is ample precedent for sans-io approaches, and I'm very happy to credit Zi of num fame for encouraging me to take that approach five years ago, when I started working on rc-zip. There are examples of sans-io in the Rust ecosystem already: the rustls crate comes pretty close, although it still somehow ties itself to the standard Read and Write traits. The consumer of the library is free to choose when to call read_tls and write_tls, which means it integrates seamlessly with a completion-based library like mio. The sans-io pattern is even more common in the C ecosystem, because, well, they have no standard I/O interface. You could have your APIs accept a file descriptor, but that would be fairly limiting; there are no templates or fancy types like Java and C++ programmers have. In C, the zstandard decompression API, for example, looks like this: you get a handle, an output buffer, and an input buffer. Those buffers are simply a pointer, a size, and a position, and when you call decompress_stream, it updates the pos field of the input buffer and of the output buffer. Based on that, you can determine what happened. If the input's position is less than its size, that means only part of the input was used during the call, and the rest should be passed again to the next call; this can happen if the decoder didn't have enough space in the output buffer, for example. If the output's position is less than the output's size, it means that the decoder is completely done and has flushed all remaining buffers. On the other hand, if the output's position is equal to the output buffer's size, that means you should call it again with more output buffer. All these states are surprisingly tricky to get right. The decompressor might need more input when you have no more input to give it; that could easily result in an infinite loop. Instead, you should have a way to signal that you have no more input to feed it, and that it should error out if it thinks the input is truncated.

Well, the rc-zip crate does the same thing, except things are a little more complicated, because, remember, the first thing we have to do is scan backwards from the end of the file, and after that, we want to be able to extract individual entries from the zip file in any order, skipping over some, going back; it's pretty far from just a linear scan. To achieve this, it exposes two state machines: ArchiveFsm is used to read the central directory, returning an Archive, and from there you can build an EntryFsm to read individual entries, knowing their offset, compression method, and so on.
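The zstd-style streaming contract described above is easier to see with a toy model. This is not libzstd's actual API, just the same (data, pos) bookkeeping with a fake "decoder" that merely copies bytes, which is enough to exercise the state interpretation:

```rust
// Toy model of a zstd-like streaming call: each buffer carries its own
// position, and the call advances both positions as far as it can.
struct InBuf<'a> {
    data: &'a [u8],
    pos: usize,
}

struct OutBuf<'a> {
    data: &'a mut [u8],
    pos: usize,
}

// Fake "decompressor": copies as many bytes as both buffers allow.
fn decompress_stream(input: &mut InBuf, output: &mut OutBuf) {
    let n = (input.data.len() - input.pos).min(output.data.len() - output.pos);
    output.data[output.pos..output.pos + n]
        .copy_from_slice(&input.data[input.pos..input.pos + n]);
    input.pos += n;
    output.pos += n;
}

fn main() {
    let src = [1u8, 2, 3, 4, 5];
    let mut dst = [0u8; 3];
    let mut input = InBuf { data: &src, pos: 0 };
    let mut output = OutBuf { data: &mut dst, pos: 0 };
    decompress_stream(&mut input, &mut output);
    // Output filled to the brim: come back with more output room.
    assert_eq!(output.pos, output.data.len());
    // Input only partially consumed: pass the rest to the next call.
    assert!(input.pos < input.data.len());
    assert_eq!(input.pos, 3);
}
```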
Driving the ArchiveFsm to completion involves following a simple loop. First, we call wants_read: if the machine wants more data, it returns Some with the offset of where in the file it wants us to read. Most of the time that follows the last read we did, but not always. If it did return Some, we call space, which borrows its internal buffer mutably; we're not dealing in raw pointers here, we get a slice back, which means we know the maximum amount of data we can put in there. Once we've performed a read, we call fill, indicating how many bytes we've read; as with the standard Read trait, a read of size zero indicates end of file. Finally, once we've fed our machine, we can call the process method, and I'm fairly happy with the design here, because it consumes the state machine. You can see it doesn't take &self or &mut self: it takes ownership of self. If it's done, then it returns the Done variant of FsmResult, and we can never accidentally call any other method on the state machine again, because it's gone. But if it's not done, if it wants more input and we should go around for another turn of the loop, then it returns the Continue variant, yielding ownership of itself back to the consumer. Now, we could of course go deeper into type safety with typestates, but I'm fairly happy with the design as it is.
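A toy machine with the same consuming `process` shape (invented for illustration; rc-zip's real ArchiveFsm also has wants_read, space, and fill, which this sketch leaves out to focus on the ownership trick):

```rust
// Done hands you the result and the machine is gone; Continue hands the
// machine back to the caller for another turn of the loop.
enum FsmResult<M, T> {
    Continue(M),
    Done(T),
}

// Toy machine: sums byte values until it has seen `needed` bytes.
struct SumFsm {
    needed: usize,
    total: u64,
}

impl SumFsm {
    // Takes `self` by value: once Done is returned, no method can ever be
    // called on this machine again, because it has been consumed.
    fn process(mut self, bytes: &[u8]) -> FsmResult<SumFsm, u64> {
        let n = bytes.len().min(self.needed);
        self.total += bytes[..n].iter().map(|&b| b as u64).sum::<u64>();
        self.needed -= n;
        if self.needed == 0 {
            FsmResult::Done(self.total)
        } else {
            FsmResult::Continue(self)
        }
    }
}

fn main() {
    let mut fsm = SumFsm { needed: 4, total: 0 };
    let chunks: &[&[u8]] = &[&[1, 2], &[3, 4]];
    let mut result = None;
    for chunk in chunks {
        match fsm.process(chunk) {
            FsmResult::Done(total) => {
                result = Some(total);
                break;
            }
            // Not done yet: take ownership back and loop again.
            FsmResult::Continue(machine) => fsm = machine,
        }
    }
    assert_eq!(result, Some(10));
}
```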

Segment 4 (15:00 - 20:00)

I'm fairly happy with the current design, which plugs fairly easily into both synchronous I/O, via rc-zip-sync, and asynchronous I/O, via rc-zip-tokio. Well, I say that, but the rc-zip-tokio implementation is actually fairly messy, because asynchronous file I/O on Linux is a mess. You want to know how tokio does an asynchronous file read on Linux? It does it with a background thread. If you go look at the sources, you're going to see spawn_blocking. I think of that every time someone blogs about how reading a file with tokio is slower than with the standard library; like, no, look at all the work it's doing. Also, this is just for files, by the way, not for TCP sockets, which is, you know, how tokio is actually supposed to be used. Just reading one gigabyte from /dev/urandom with tokio and with libstd (a terrible test program), we can see a difference in performance: the sync operation is consistently faster on a Linux server. The actual numbers matter very little; what's interesting is digging in with lurk, an strace-like tool written in Rust. Also, did you know the strace logo is an ostrich? Now you do. His name is Strauss. With lurk, we can observe that the async version is doing a lot of this: it's making 128 KiB reads from one thread, then wakes up another thread, which queues some more work, and so on and so forth, doing that dance 8,000 times over the course of the program. By comparison, the synchronous version simply does this: one single, majestic, 1 GB read syscall. But it's not tokio's fault, not really: there simply was no good way to do async file reads on Linux until io_uring came around. If we change that (again, terrible) test program to force it to do reads of at most 128 KiB, which is what tokio does anyway, and we add a tokio-uring variant, we see that it is consistently competitive with the sync version, and consistently faster than classic tokio, by about 10%. I'm not giving exact numbers because I'm frankly ashamed of my setup, and you could tune the numbers to make them say what you want. What I do want to show you is the read loop of the tokio-uring version, in terms of syscalls: instead of read, it calls io_uring_enter to submit the read operation, then epoll_wait to wait for some operations to be completed, and write to wake itself up, because that's how tokio channels and wakers actually work. Here's part of a stack trace: you can see tokio-uring going into tokio, going into mio, going into write, when submitting ops. That's how asynchronous syscalls are done in io_uring land. Tokio-uring keeps a Waker around as part of each operation's lifecycle. That Waker really is just a boxed trait object in disguise: it has a data pointer and a virtual table (we've seen that recently). LLDB has limited support for Rust, but we can still print the local variable and see, kind of, what's in there, and we see two pointers: one is the data, and one is the vtable. That vtable contains clone, wake_by_ref, and drop functions, and what tokio actually does when you call wake_by_ref is up to the mio crate, which on Linux uses eventfd, an API that allows applications to create file descriptors just for the purpose of signaling events, which is cheaper than using a pipe. It can be multiplexed via epoll just like any other file descriptor: regular files, network sockets, etc. This kind of overhead, from mixing epoll and io_uring, is why some folks chose to make their own runtimes, entirely separate from tokio: the Datadog folks made glommio, the ByteDance folks made monoio, vertexclique made nuclei; there's no shortage of interesting work going on. Adding a monoio variant to our test program, which is pretty simple, shows that the hot loop becomes just io_uring_enter. It is, however, very important to note that this isn't actually a benchmark. Actual benchmarks barely indicate anything about the performance of real-world systems, but this test program didn't even attempt to indicate anything; we were just poking at various systems to see how they worked. All that said, I think monoio looks promising. So, to cap it all off, I think we should make an rc-zip-monoio package, just because we can. We keep it simple and try to implement a single async function taking a reference to a file and returning an Archive or an error. The File type here is from monoio, and so it comes with a native read_at method, but it has a signature that departs from the usual tokio stuff: that function takes ownership of the buffer and returns it, even if the operation failed. That is a requirement for a memory-safe io_uring interface in Rust, and tokio-uring does the same thing: it prevents the buffer from being freed before the operation completes or is canceled. It's like we're giving ownership of the buffer to the kernel. There was an excellent P99 CONF talk about that recently, by me and my cat. If the sleep completes first and we throw away the read future, then as far as the type system is concerned, we're good to mutate the buffer again, or free it; but the kernel also thinks it's fine to write to this buffer whenever the read finally completes, and so we've got a use-after-free. Our code is no longer safe, which means we've messed up the type modeling of our program. That API makes the structure of our code a little peculiar.
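The shape of such an API can be sketched with plain blocking std on Unix. `read_at_owned` is a made-up stand-in for monoio's buffer-by-value methods; the point is purely the signature: the buffer goes in by value and comes back alongside the result, even on error:

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::unix::fs::FileExt;

// Completion-style shape (as in tokio-uring/monoio): take the buffer by
// value, give it back with the result. A real io_uring runtime does this so
// the buffer cannot be freed while the kernel still owns it; here we just
// mimic the signature with a blocking positional read.
fn read_at_owned(file: &File, mut buf: Vec<u8>, offset: u64) -> (io::Result<usize>, Vec<u8>) {
    let res = file.read_at(&mut buf, offset);
    (res, buf) // the caller always gets the buffer back
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("read_at_owned_demo.bin");
    File::create(&path)?.write_all(b"hello, uring")?;
    let file = File::open(&path)?;

    let buf = vec![0u8; 5];
    let (res, buf) = read_at_owned(&file, buf, 7);
    assert_eq!(res?, 5);
    assert_eq!(&buf[..], b"uring");

    std::fs::remove_file(&path)?;
    Ok(())
}
```

With a real completion-based runtime the call would be `async` and the kernel would own the buffer until completion; the ownership-in, ownership-out signature is what makes that safe to express.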

Segment 5 (20:00 - 24:00)

First off, our buffer is not a Vec: we don't need to track capacity and length separately, and we don't need it to grow, so we simply have a boxed slice of u8 instead, 256 KiB of it, fully initialized (MaybeUninit is out of scope for today). After finding out the size of the file, we create the state machine and enter the loop. In the loop, if the machine wants a read, then the first thing we do is calculate how big of a read we can make. We don't want to read more than what the machine has room for; but also, we can't use the machine's own buffer, due to the current rc-zip APIs: it only lends us its buffer mutably, it doesn't give us ownership of it, so we can't transfer ownership of it to the kernel. We will need to read into our own buffer and then copy into the machine's buffer. So the maximum read size is the minimum between the size of our buffer and the size of the machine's buffer. Once we've established that, we can obtain a SliceMut, a type provided by monoio: it's like a slice, but it's owned, and it'll make sure that we don't read too much data. And, as promised, we get the buffer back whether or not the operation was successful. So first we propagate errors, and then we copy to the machine's buffer however many bytes we've read, letting it know how much that was with its fill method. Finally, we can take back ownership of our buffer, which is stashed inside the SliceMut we got back from read_at, so we need to call into_inner to unwrap it. And this explains why buf is a mutable binding in the first place: we were able to move out of it during a loop iteration, on the condition that we put it back before the loop loops. If we didn't, the Rust compiler would gently but firmly refuse to proceed. And here's a program that uses it; again, if you're a patron, you have access to that on the side, you can go and mess with it. The program runs on macOS with the legacy monoio driver, and it runs on Linux with the io_uring driver, which is what we're really interested in.
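That move-out/put-back dance can be shown with a stand-in: `fill_with_data` below is hypothetical (in the real program, monoio's read_at plays that role), but the ownership choreography around the loop is exactly the one described above:

```rust
// Stand-in for an API that takes the buffer by value and returns it,
// the way monoio-style completion APIs do.
fn fill_with_data(mut buf: Box<[u8]>) -> (usize, Box<[u8]>) {
    for (i, b) in buf.iter_mut().enumerate() {
        *b = i as u8;
    }
    (buf.len(), buf)
}

fn main() {
    let mut buf: Box<[u8]> = vec![0u8; 8].into_boxed_slice();
    let mut total = 0;
    for _ in 0..3 {
        // `buf` is moved out here...
        let (n, back) = fill_with_data(buf);
        total += n;
        // ...and must be put back before the loop loops, otherwise rustc
        // gently but firmly refuses to proceed ("value moved in previous
        // iteration of loop").
        buf = back;
    }
    assert_eq!(total, 24);
    assert_eq!(buf[7], 7);
}
```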
We can see that, from the io_uring_setup call to the printing of the file listing, there is not a single read or write syscall: it's all happening as uring ops. The only syscalls we do see are brk, mmap, mremap, and munmap, which are all related to heap allocation. We talk about heap allocation in the 14th part of the "Making our own executable packer" series, which you can check out on my blog; it is available for free, for everyone, right now. The implementation of the other state machine, EntryFsm, is left as an exercise to the reader. It's simpler in a way, because the reads are linear, and also more complicated in another way, because it actually streams data out as the file is decompressed. But you only need to implement it once, and then you get support for all the compression methods supported by rc-zip, including deflate, bzip2, LZMA, and zstandard. Although there are other avenues being explored to avoid that sync/async chasm, like keyword generics, I believe the way forward is to simply implement formats and protocols in a sans-io way. I think that unifying libstd and tokio is the wrong approach, because neither interface is compatible with modern I/O APIs like io_uring. Making an API io_uring-friendly involves rethinking a lot of things, actually, and it's fascinating to see that other ecosystems that don't have any standard I/O abstraction, like C, or ecosystems with a much higher level of abstraction, like Node.js, have been faster at adopting io_uring than something like Rust, where a lot of code was written against a different, less flexible model. See, I can say bad things about Rust; I'm not a shill. Probably.

"She packed my bags last night, pre-flight / Zero hour, nine a.m. / And I'm gonna be high as a kite by then / I miss the Earth so much, I miss my wife / It's lonely out in space / On such a timeless flight..."

More videos by this author: fasterthanlime
