A Vulnerability to Hack The World - CVE-2023-4863

Table of Contents (4 segments)

Segment 1 (00:00 - 05:00)

On the 7th of September, Citizen Lab announced the discovery of a new iOS zero-day being actively exploited in the wild. Apparently the entry point for this vulnerability was a PassKit attachment. These passkits are basically the tickets you can add to your iPhone wallet, and so the victim was probably sent an iMessage, iOS parses the PassKit attachment, which can include images, and the phone gets hacked. Of course Citizen Lab had immediately contacted Apple to report their findings, and on the same day as the press release we also got an iOS update: "Processing a maliciously crafted image may lead to arbitrary code execution." A few days later we also got an update from Chrome reporting that a critical heap buffer overflow had been reported in the image format WebP, reported by Apple and Citizen Lab. At the time the bug report details were still restricted, but it was clear that there was an issue within the image format WebP. And finally it was confirmed here: looking at the WebP commits, there were changes related to "Fix out-of-bounds write (a buffer overflow) in BuildHuffmanTable." So what we have here is a very critical vulnerability in an image file format used by iOS and Chromium. And with these two code bases we basically already cover the world. But if you now think you are safe because you use Android and Firefox? Well. Obviously Firefox is also affected, and so is Android. Image parsing with ImageMagick on the web server? Vulnerable. Any software that supports WebP most likely uses the official WebP library, and that's where the vulnerability lies. So I don't think it's an exaggeration to say that this was one of the most valuable vulnerabilities that could exist. Looking at the prices for 0-days: of course, for a full chain this single vulnerability is not enough, BUT this WebP vulnerability could be the entry point for any of these categories. This is an insane vulnerability that had existed in the source code since 2014.
That alone makes it interesting to look at. But this is not the only fascinating thing about it. Because, if it's a vulnerability that could be used to exploit so many different projects, why have we not yet seen any full example exploits for any target? Why is it that we only have some proof-of-concept .webp files that trigger the overflow, but nothing more?! This is the start of a small mini-series where I want to share with you everything I learned about this vulnerability. In this first video, I will try to explain the general concept of WebP and the cause of the overflow. In the second video we will then dive into the question of fuzzing: how can we set up fuzzing, and could we have found this holy grail of a vulnerability? And then finally we look closer at the code and try to reason about the discovery process. So let's start with the first video, and I guess I'd summarize it like this: this video is about "why you should study computer science and pay attention in your data structures and algorithms classes." Probably the best article written about this vulnerability is by Ben Hawkes, and this was obviously also one of the main resources I started with. So let's see what he was able to learn from the public security patch. > The vulnerability is in the "lossless compression" support for WebP. [...] To achieve this, WebP uses an algorithm called Huffman coding. > Although Huffman coding is conceptually based on a tree data structure, modern implementations have been optimized to use tables instead. The patch suggests that it was possible to overflow the Huffman table when decoding an untrusted image. > [...] the vulnerable versions use memory allocations based on pre-calculated buffer sizes from a fixed table, and will then construct the Huffman tables directly into that allocation. Okay. Sounds like, in order to understand this vulnerability, we need to understand lossless compression and specifically Huffman codes.
I guess we have to brush up on some computer science algorithms and data structures knowledge. When I want to learn something, the biggest issue I face is the unknown unknowns. There is just so much hidden information that I struggle to even pose the right questions. So usually I try to get a broader understanding of a topic first, because then hopefully I will also start to recognize what I don't know, and I can specifically seek out the answers. So I went on YouTube and just typed in "huffman codes", trying to understand the bigger picture. We can quickly find a video by Tom Scott, and then I also found these two videos by Reducible: "Huffman Codes: An Information Theory Perspective" and "How PNG Works: Compromising Speed for Quality". Specifically the latter one was very helpful to me - while it's about PNG, it's not unreasonable to assume that WebP, being newer, built upon concepts from something like PNG. Anyway. From the Tom Scott video we can learn the basics of how data can be compressed with Huffman codes. And it's kinda simple. You first perform a frequency analysis of symbols, which means in the case of text we can take each character as a symbol

Segment 2 (05:00 - 10:00)

and count how many times it is seen, and create an ordered list of them. Then we can start with the least common characters (or symbols) and start building a tree structure. The characters 0 and 1 are each only seen once, so together they make up a small subtree with their sum, 2. Then we have the character M with two occurrences, so we can combine them into a new subtree, their sum being 4. And now we keep going until we get a complete tree. The tree can be used to compress data, but let's talk about the decompression case. Let's say we transmit this tree and the compressed bits 11000; then the receiver can decode it. We start reading the bits, and a 1 means right subtree, and 0 left. So right, right, left, left, left, and we reach the symbol W. Which means we successfully compressed the character W, usually encoded in ASCII as 8 bits, down to just 5 bits. So that's Huffman codes and Huffman trees explained. Then, from watching Reducible's videos, going a bit more in depth and learning about some clever ideas for how an image could be compressed, most important to me was the fact that PNG compresses each color into its own separate Huffman codes and tree. Because I have read this about WebP as well. So we might even talk about multiple trees: a tree for the red, blue and green pixel values. And well, after watching these videos, I felt like I understood the basics of Huffman codes and how this lossless compression relates to an image file format, but I was still confused about the Huffman tables. All the resources just talk about Huffman trees, but the vulnerability is about overflowing the Huffman tables. So I was asking myself, where did I see "tables"? First I thought it must be the symbol frequency table - how often a symbol appears. And looking at some Huffman code homework exercises, we often see such a table created. But that didn't make much sense to me, for two reasons. First of all, a table with a list of symbols is unlikely to overflow.
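The build-and-decode walk described above can be sketched in a few lines of Python: merge the two least frequent subtrees until one tree remains, then decode a bit string by walking the tree (0 = left, 1 = right). The frequencies below are made up for illustration; this is only a conceptual sketch, not how libwebp works.

```python
# A minimal Huffman sketch: frequency analysis -> tree -> bit-walk decode.
import heapq
import itertools

def build_tree(freqs):
    # Heap entries: (frequency, tiebreaker, node). A node is either a
    # symbol string (leaf) or a (left, right) tuple (internal node).
    counter = itertools.count()
    heap = [(f, next(counter), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(counter), (left, right)))
    return heap[0][2]

def codes(tree, prefix=""):
    # Walk the tree and collect the bit string leading to each leaf.
    if not isinstance(tree, tuple):  # leaf: a single symbol
        return {tree: prefix or "0"}
    table = {}
    table.update(codes(tree[0], prefix + "0"))
    table.update(codes(tree[1], prefix + "1"))
    return table

def decode(tree, bits):
    out, node = [], tree
    for bit in bits:
        node = node[int(bit)]            # 0 = left child, 1 = right child
        if not isinstance(node, tuple):  # reached a leaf: emit the symbol
            out.append(node)
            node = tree                  # restart at the root
    return "".join(out)
```

Frequent symbols end up closer to the root and therefore get shorter codes, which is the whole point of the compression.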
We know how many symbols there are. Color values go from 0-255, 8 bits, so we know this encoding table must always have 256 entries. How would that overflow? And the second reason I thought it was weird: a table like this is not really efficient. Do you have to loop through the table for every bit to find the matching code? This is especially inefficient because the codes can have different lengths, depending on how deep the tree goes. Something didn't add up. So I kept researching more. Maybe I also asked ChatGPT. And eventually I stumbled over this Stack Overflow post. This one cleared up my confusion about the efficiency of the table. Let's say we have these codes. Basically this is how the tree would look: a is straight to the left, b is right-left, and so forth. If we want to put this into an efficient table, we just create a list of 3-bit indices - 3 bits because that's the longest code. And now when we decode data, we always take 3 bits and literally use that as a numeric offset into the table. 010. This line. You can see the a is expanded to many rows, because we know only 1 bit is required for the a: the 0. So we essentially ignore all the other bits, meaning we can slide over by one bit to the next 3, and we get 101. That's a binary 5, so we can go straight to that table offset, and it's a b, consuming two bits. And when I read that, I started to feel like I understood why this table could overflow. The length of this table depends on the bits of the longest code. And so "a" in this table was not just one entry; it got expanded to all 3-bit codes starting with 0. So if we could provide input that leads to a Huffman table with very long codes, the table could explode in size. That sounded logical, and it's kinda the idea of this vulnerability, but this cannot be the full story. It would be easy to craft a tree so that the maximum code length gets longer and longer; the developers would never be certain about a maximum table size.
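The flat lookup-table trick described above can be sketched like this. The codes are assumed to be a=0, b=10, c=110, d=111 (longest code: 3 bits), so the table has 2^3 = 8 entries, and short codes are replicated across every index that starts with their bits. This is an illustration of the idea, not libwebp's actual code.

```python
# Flat single-level Huffman lookup table: peek max_bits, consume code length.
CODES = {"a": "0", "b": "10", "c": "110", "d": "111"}
MAX_BITS = max(len(c) for c in CODES.values())

def build_table(codes, max_bits):
    table = [None] * (1 << max_bits)
    for sym, code in codes.items():
        pad = max_bits - len(code)
        base = int(code, 2) << pad
        # Every padding pattern maps to the same (symbol, code length) entry,
        # so "a" (code 0) fills indices 000, 001, 010, 011.
        for i in range(1 << pad):
            table[base + i] = (sym, len(code))
    return table

def decode(table, bits, max_bits):
    out, pos = [], 0
    while pos < len(bits):
        # Peek max_bits bits (zero-padded at the end of the stream) and use
        # them directly as a table offset - no tree walking, no searching.
        chunk = bits[pos:pos + max_bits].ljust(max_bits, "0")
        sym, length = table[int(chunk, 2)]
        out.append(sym)
        pos += length  # consume only the bits the code actually used
    return "".join(out)
```

This matches the walkthrough in the text: peeking "010" hits an "a" entry and consumes 1 bit, then the window slides and the next 3 bits "101" (binary 5) hit a "b" entry and consume 2 bits.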
And when reading the vulnerability description, this simple explanation didn't explain it. There must be more complex logic behind it: "The purported maximum size for a symbol size of 40 with a root table of 8-bits and a maximum code length of 15 is 410". Or also the comment in the source code: "fixed table size values computed for 8-bit first level lookup". What? How does that fit into our table understanding? I had to ask somebody, and eventually, after talking to @mistymountancop on Twitter, it clicked for me. I found the missing link. They wrote: "Each huffman lookup table can have 2nd level table to prevent the table from becoming too big." So this links our understanding of these tables growing with the fact that the developers are aware of it - that's why two layers of tables are used in practice. They also shared this

Segment 3 (10:00 - 15:00)

article explaining this more: "The gzip approach is to introduce a special marker on these "too long" entries that tells the inflating code that the entry is a pointer to another lookup table". So you just split up the table into two tables. You prevent the size from exploding, and still get the speed of direct table offsets. And that's why it's used in practice in various places, including the WebP Huffman tables. And if we now go back to the WebP source code, we can finally understand where the hardcoded table sizes come from. Apparently they are the result of running this tool called "enough.c": "All values computed for 8-bit first level lookup". Let's run this tool with the parameters as mentioned here. So the first table, used for example for the color red, will cover 256 symbols - those are our 256 different possible color values. Then the question: how many bits for the root table, the first table? This will be 8 bits. But can the second table still explode in size? No, we only support a maximum code length of 15. These are our parameters, and this tool now performs an exhaustive search, a brute force, to find the maximum possible table size when compressing data. In this case we get the result: 630. So we have 630 as the table size for the red, blue and alpha pixel channels - 3 times. And then we also have a distance value, but that only has 40 symbols, which results in a size of 410. There is one color channel still missing: the green channel is special, because it depends on some color cache size - no clue what that means, but it's a bit larger. 256 symbols for the color + 24 length prefixes + the color cache size. And that's how we get the 654 here. And the others are then respectively for color_cache_size 2, 4, 8, 16, and so forth.
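The two-level scheme can be sketched like this, reusing the small code set from before. For readability the sketch uses a hypothetical 2-bit root table (libwebp's real root table is 8 bits, and its implementation differs); codes longer than the root width share a root entry that points into a second-level table indexed by the remaining bits.

```python
# Two-level Huffman lookup: small root table + pointer entries to subtables.
ROOT_BITS = 2  # hypothetical; libwebp uses an 8-bit root
CODES = {"a": "0", "b": "10", "c": "110", "d": "111"}

def build_two_level(codes, root_bits):
    max_bits = max(len(c) for c in codes.values())
    sub_bits = max_bits - root_bits  # index width of second-level tables
    root = [None] * (1 << root_bits)
    subtables = {}  # root index -> second-level table
    for sym, code in codes.items():
        if len(code) <= root_bits:
            # Short code: replicate across all root entries with this prefix.
            pad = root_bits - len(code)
            base = int(code, 2) << pad
            for i in range(1 << pad):
                root[base + i] = ("leaf", sym, len(code))
        else:
            # Long code: the root entry is a pointer to a second-level table
            # indexed by the bits after the first root_bits bits.
            prefix = int(code[:root_bits], 2)
            sub = subtables.setdefault(prefix, [None] * (1 << sub_bits))
            root[prefix] = ("sub", prefix, None)
            rest = code[root_bits:]
            pad = sub_bits - len(rest)
            base = int(rest, 2) << pad
            for i in range(1 << pad):
                sub[base + i] = (sym, len(code))
    return root, subtables, sub_bits

def decode(root, subtables, sub_bits, bits, root_bits):
    out, pos = [], 0
    while pos < len(bits):
        idx = int(bits[pos:pos + root_bits].ljust(root_bits, "0"), 2)
        kind, val, length = root[idx]
        if kind == "leaf":
            out.append(val)
            pos += length
        else:  # follow the pointer into the second-level table
            rest = bits[pos + root_bits:pos + root_bits + sub_bits]
            sym, total = subtables[val][int(rest.ljust(sub_bits, "0"), 2)]
            out.append(sym)
            pos += total
    return "".join(out)
```

The root table stays at a fixed 2^root_bits entries no matter how long the longest code gets; only the subtables grow, and their worst case is what a tool like enough.c brute-forces.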
It covers 5 different Huffman trees: for red, blue, alpha, distance, and green plus color cache. And each of the five Huffman code tables actually consists of two different tables - the root table and the 2nd-level table. And they are ALL managed and allocated in ONE big memory chunk. Let's peek into the code. Here we can see that the kTableSize[] array is used, based on color_cache_bits, to determine the maximum table size. And then later this one huge table is allocated and passed into ReadHuffmanCode. Here a comment says that the code_lengths[] array is used to create the Huffman tree (or the Huffman table), so keep that in mind - that array will become important. But let's first follow the path of the table buffer. This table is then passed to VP8LBuildHuffmanTable(), along with the code_lengths buffer, and then it calls BuildHuffmanTable(). And in here is where the heap overflow finally happens. But how is this table now constructed? Let's think about this. The table represents a Huffman tree, right? Each image that we compress will have different pixel values, or symbols. So the Huffman tree shape depends on the frequency of the symbols in the input. And the tree shape then determines the Huffman code bit lengths. And these bit lengths then determine the table size. If we know there is one symbol with bit length 1, one with bit length 2, and two with bit length 3, we know symbol one requires 4 entries, symbol two requires 2 entries, and the others require one each. This means the code length information is all we need to construct the table! And maybe you already guessed: this is exactly the role of the code_lengths array. This array simply stores, for each symbol, the code length. And if we look at where this data is actually coming from, we see that it basically comes straight from the .webp file. Here this array is created, and we can see it's using VP8LReadBits, which is the helper function to read bits from the file.
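That "bit lengths determine the table size" arithmetic can be written down directly. In a flat table sized by the longest code, a symbol whose code is `length` bits long owns 2^(max_bits - length) entries; the lengths below are the example from the text (one 1-bit, one 2-bit, two 3-bit codes).

```python
# How many flat-table entries each symbol occupies, given only code lengths.
def table_entries(code_lengths):
    max_bits = max(code_lengths)
    # A code of `length` bits owns every index that starts with its bits.
    return [1 << (max_bits - length) for length in code_lengths]
```

For a valid set of lengths the entry counts add up to exactly the full table (4 + 2 + 1 + 1 = 8 = 2^3), which is why code lengths alone are enough to construct the table.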
I believe there is another Huffman table involved to compress this array as well, but that's not important here. Let's see how this array is used to build the table. There is a first step where we loop over all entries of it, constructing a new array called count, which has 15 entries - that is the maximum allowed code length. As we know, each entry in the code_lengths array represents a symbol and its Huffman code bit length. So if we for example look at the table with 40 symbols, it will be 40 entries long. This loop now goes through each symbol, looks at the length of the Huffman code for this symbol, and increments a counter of how many symbols we have of a certain length. Let's say symbol 1 has length 2 - the Huffman code has two bits - then we increment the count array at position 2. Symbol 2 maybe has a 5-bit Huffman code, so we increment count[5]. You get the point. The result is a count array with 15 entries that tells us how many symbols have a certain bit length. And this count
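This first step is just a histogram, and can be sketched as follows. The maximum length of 15 matches the limit mentioned in the video; the example lengths are made up, and this is a simplification of the real BuildHuffmanTable code.

```python
# Histogram the code_lengths array: count[n] = number of symbols whose
# Huffman code is n bits long. code_lengths comes straight from the file.
MAX_LENGTH = 15  # maximum allowed code length in WebP lossless

def count_lengths(code_lengths):
    count = [0] * (MAX_LENGTH + 1)
    for length in code_lengths:
        count[length] += 1  # e.g. a 5-bit code bumps count[5]
    return count
```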

Segment 4 (15:00 - 17:00)

array is then used to construct the table. At least that's what I think this algorithm does. To be honest, I haven't really tried to understand this code, but I'm pretty sure. And now, pay attention, because here comes the mental AHA! moment for me. I don't think it's obvious at first, but after spending countless hours trying to understand this, I realized something. There is a huge difference between creating this table or array during compression, and creating it again during decompression. Hold my mate, let me show you. Let's imagine an invalid table where 4 symbols have bit length 1. The tree representation of that makes no sense. With one bit you have 0 and 1; you could encode two symbols, but you cannot encode the other two. Obviously such a tree does not exist, right?! BUT in the code_lengths array, in the .webp file, we could encode that. We could say there are symbols 1, 2, 5, and 7 and they have bit length 1, which leads to a nonsensical count[] array. According to this example, we have 4 symbols with length 1. And that can never exist. And yet, we use this value to construct the tables in memory. And now you can get a feeling for the problem. The precalculated, hardcoded table sizes are based on input to be compressed; in that case we always generate valid trees and thus nice tables. That's why we can use the tool enough.c to calculate the maximum table size that could ever happen. BUT on the decompression side, when we try to create our table from this data, we can create nonsensical tables. And they might be larger than expected? Mmhhh… Nice theory. But how can we figure out how to craft the values for such a table? Well, maybe your next question is: if these tables are directly constructed from the image file, and I'm sure a fuzzer will throw lots of invalid table information at the file - could we find this vulnerability with fuzzing? Let's talk about that in the next video. If you liked this video, please check out hextree.io.
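There is a classical way to state why "four symbols of bit length 1" can never exist: the Kraft inequality says a set of code lengths l_i fits into a binary tree only if the sum of 2^(-l_i) is at most 1. A sketch of that check (an illustration only; libwebp's actual validation logic differs):

```python
# Kraft inequality check: valid prefix-code lengths satisfy sum(2**-l) <= 1.
# Attacker-controlled lengths that violate it describe an impossible tree,
# yet a decoder that trusts them can still be driven to build a table.
from fractions import Fraction

def kraft_sum(code_lengths):
    return sum(Fraction(1, 2 ** length) for length in code_lengths)

valid = [1, 2, 3, 3]    # a=0, b=10, c=110, d=111: sums to exactly 1
invalid = [1, 1, 1, 1]  # four 1-bit codes: sums to 2, impossible tree
```

enough.c only ever sees length sets that satisfy this inequality, because the compressor derives them from real trees; a .webp file is free to ship length sets that don't.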
This is an online training platform developed by stacksmashing and me, and we are trying to create the best video tutorials to learn hacking. It's a ton of work and we are slow, so we are not fully public just yet and still working on our catalogue, but we are getting there eventually. So I'd appreciate it if you check out hextree.io. And if you would like to support LiveOverflow videos directly, check out my font at shop.liveoverflow.com. It's definitely a shitty present to yourself or people you hate! Besides that, maybe consider becoming a Patreon or YouTube member. The financial support is very appreciated. Thanks and see you soon.

More videos by this author: LiveOverflow
