Deep image reconstruction from human brain activity (Paper Explained)

Yannic Kilcher · 25.05.2020 · 21,185 views · 591 likes


Video description
Can you peek into people's brains? Reading human thoughts is a long-standing dream of the AI field. This paper reads fMRI signals from a person and then reconstructs what that person's eyes currently see. This is achieved by translating the fMRI signal to features of a Deep Neural Network and then iteratively optimizing the input of the network to match those features. The results are impressive.

OUTLINE:
0:00 - Overview
1:35 - Pipeline
4:00 - Training
5:20 - Image Reconstruction
7:00 - Deep Generator Network
8:15 - Results

Paper: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006633

My Video on OpenAI Microscope (what I called Atlas): https://youtu.be/Ok44otx90D4

Abstract: The mental contents of perception and imagery are thought to be encoded in hierarchical representations in the brain, but previous attempts to visualize perceptual contents have failed to capitalize on multiple levels of the hierarchy, leaving it challenging to reconstruct internal imagery. Recent work showed that visual cortical activity measured by functional magnetic resonance imaging (fMRI) can be decoded (translated) into the hierarchical features of a pre-trained deep neural network (DNN) for the same input image, providing a way to make use of the information from hierarchical visual features. Here, we present a novel image reconstruction method, in which the pixel values of an image are optimized to make its DNN features similar to those decoded from human brain activity at multiple layers. We found that our method was able to reliably produce reconstructions that resembled the viewed natural images. A natural image prior introduced by a deep generator neural network effectively rendered semantically meaningful details to the reconstructions. Human judgment of the reconstructions supported the effectiveness of combining multiple DNN layers to enhance the visual quality of generated images. While our model was solely trained with natural images, it successfully generalized to artificial shapes, indicating that our model was not simply matching to exemplars. The same analysis applied to mental imagery demonstrated rudimentary reconstructions of the subjective content. Our results suggest that our method can effectively combine hierarchical neural representations to reconstruct perceptual and subjective images, providing a new window into the internal contents of the brain.

Authors: Guohua Shen, Tomoyasu Horikawa, Kei Majima, Yukiyasu Kamitani

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Table of contents (6 segments)

Overview

Hi there! Today we're looking at "Deep image reconstruction from human brain activity" by Guohua Shen, Tomoyasu Horikawa, Kei Majima and Yukiyasu Kamitani. This is like reading thoughts, so I was excited when I saw this paper. I saw this on Reddit, and it is a bit older, from the beginning of last year, so I'm sure there have been developments in this area. But basically, what this paper does is have a human look at a picture, as you can see for example right up here. It measures the fMRI activity, then it uses what they call a feature decoder to map that fMRI activity to features of a deep neural network, and then they reconstruct the image that is closest to those features in the neural network. By reconstruction they basically get out an image of what the human sees. So we're going to explore this pipeline right here. It's pretty cool, and if it works, it basically means that we can read someone's thoughts, but of course there are going to be issues and problems. First of all, this is all visual: they measure the activity in the visual cortex right here. So let's break

Pipeline

it down into the individual parts. The fMRI, that is a machine we cannot control, right, so you measure the fMRI activity, and that basically measures which of these cells in your brain use oxygen. It's functional MRI, not structural, so it measures which ones are active, and that's how you would see which parts of the brain are active. I think the resolution on these machines has gotten very good, so you can make out very fine-grained activation patterns in the neurons. They measure the visual cortex, which is the part responsible for visual stimuli, for seeing things. Now they need this feature decoder, because ultimately what they want is to have these fMRI features correspond to features in a neural network. This DNN here is a VGG, I think a VGG16 or a VGG19 network. These architectures are from a couple of years ago, were very popular for ImageNet, and are fairly basic. That means it's not a super-duper Inception net where you have layers within layers and so on; they're pretty straightforward convolutional neural networks with nonlinearities and pooling. Here you can see there is a bunch of layers, then there's pooling, then more layers, and so on. What you want to look at are the individual layers of the deep neural network. You basically put an image right here into the neural network and observe its features, and you put the same image through the human and observe the fMRI features, and you know that this is the same image. So you can learn a feature decoder. This is going to be another machine-learned model; I haven't actually checked whether it's just a linear regression or a neural network, but it's a regression that maps the fMRI to

Training

the features. So this is what you have to learn. Basically, they took a bunch of humans, stuck them in an fMRI machine, and collected their fMRI data: for a given image x they got the human fMRI data, and they got the VGG features when they put the image x through the neural network. Now they learn a function that minimizes the error between the predicted and the actual features. So basically they fit this on a training set of images, and you end up with a function that can map fMRI activity to neural network features. Now, the second step: assuming this works, you can give the human an arbitrary image, and the human will basically interpret it through their visual cortex. You measure the activity in the visual cortex, and then you can predict the neural network features that image would produce if it were given to the neural network. But now you don't give the image to the neural
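To make the feature-decoder step concrete, here is a minimal sketch using a closed-form ridge regression from fMRI voxel patterns to the DNN features of one layer. All shapes, names, and the random data are illustrative assumptions, not the paper's actual decoder (which I haven't checked in detail; it could be a sparser or more elaborate model):

```python
import numpy as np

# Hypothetical sizes: n_samples training images, n_voxels fMRI voxels,
# n_features DNN units in one layer. Data is random, for illustration only.
rng = np.random.default_rng(0)
n_samples, n_voxels, n_features = 200, 500, 64

X = rng.normal(size=(n_samples, n_voxels))    # fMRI patterns, one row per image
Y = rng.normal(size=(n_samples, n_features))  # DNN features for the same images

# Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T Y
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_voxels), X.T @ Y)

def decode_features(fmri_pattern):
    """Predict DNN features for a new fMRI pattern."""
    return fmri_pattern @ W

pred = decode_features(X[:5])   # shape (5, 64): decoded features for 5 patterns
```

In the real pipeline you would fit one such decoder per DNN layer, so that a new fMRI measurement can be translated into a full multi-layer feature target.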

Image Reconstruction

network. Instead, you do something like Deep Dream does. That means you start from a noisy image at the beginning, and through iterative gradient descent (that's this arrow right here) you refine this image. You try to find the image whose internal representation matches, as closely as possible, the features that you predict from the fMRI signal, because these are the features that the neural network should see. If your feature decoder is good, these are the features that the neural network should output for that image. So you're basically trying to find this image right here, but you're not looking at it; you only look at the features that should be in the neural network. After a bunch of refinement steps, you hope you end up with an image that corresponds to these features, and then you can look at it. It usually looks something like this; we're used to these kinds of things from neural networks, and I invite you to look at something like the OpenAI Microscope (what I called Atlas) if you want to understand how this is done. Basically, we can get the image that most faithfully corresponds to these features. Now this doesn't always work super well, because since the neural network is sort of a dimensionality reduction technique, there are actually many
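The optimization loop described above can be sketched with a toy stand-in for the network. Here the "DNN" is just a fixed random linear feature map, so the gradient is analytic; the real method backpropagates through the layers of a pretrained VGG. Everything here is an illustrative assumption, only the shape of the loop matches the technique:

```python
import numpy as np

# Toy stand-in for the DNN: a fixed random linear feature map F.
# The real method matches features of a pretrained VGG network.
rng = np.random.default_rng(0)
n_pixels, n_features = 256, 64
F = rng.normal(size=(n_features, n_pixels)) / np.sqrt(n_pixels)

true_image = rng.normal(size=n_pixels)
target = F @ true_image          # the features the decoder would predict

x = rng.normal(size=n_pixels)    # start from a noisy image
lr = 0.5
for _ in range(500):
    residual = F @ x - target    # feature mismatch
    x -= lr * (F.T @ residual)   # gradient of 0.5 * ||F @ x - target||^2

loss = 0.5 * np.sum((F @ x - target) ** 2)   # should be near zero after convergence
```

Note that `x` need not converge to `true_image`: the feature map throws away information, so many images share the same features. That is exactly the ambiguity the next section's generator prior is meant to resolve.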

Deep Generator Network

images that correspond to the same features, and they often end up looking really weird. So what they do is use this deep generator network as a prior. Basically, this is the generator from a generative adversarial network. This network right here is really good at producing natural-looking images, and now, because we have that, our task is not going to be to start from this image right here; our task will be to do the exact same thing, but with the input to the deep generator network. So basically, we're trying to find the input vector to the deep generator network such that these features right here correspond to the features that we predict from the fMRI activity. Because the deep generator network is trained to produce natural-looking images, this will always give us a more or less natural-looking image, no matter what our input vector is. Thereby we basically constrain the optimization procedure to only output natural-looking images. So let's see how well this works.
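The change from pixel-space search to latent-space search is small: the optimization variable becomes the generator's input vector z, and the gradient is chained through the generator. In this sketch both the generator and the feature map are made-up linear maps so the chain rule is explicit; the real method uses a pretrained deep generator network:

```python
import numpy as np

# Same idea as the pixel-space search, but now we optimize a low-dimensional
# latent vector z and let a (here: linear, made-up) generator G produce the
# image. G and F stand in for the real deep generator and VGG feature map.
rng = np.random.default_rng(1)
n_latent, n_pixels, n_features = 16, 256, 64
G = rng.normal(size=(n_pixels, n_latent)) / np.sqrt(n_latent)    # toy generator
F = rng.normal(size=(n_features, n_pixels)) / np.sqrt(n_pixels)  # toy feature map

target = F @ G @ rng.normal(size=n_latent)   # features of some generated image

z = np.zeros(n_latent)
lr = 0.05
for _ in range(1000):
    residual = F @ (G @ z) - target
    z -= lr * (G.T @ (F.T @ residual))   # chain rule through the generator

loss = 0.5 * np.sum((F @ G @ z - target) ** 2)
reconstruction = G @ z   # the image is always in the generator's output range
```

The constraint the video describes shows up in the last line: whatever z the search lands on, the output image lies in the range of the generator, which for a real GAN generator means it looks natural.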

Results

So these are the reconstructions. I have to say: there is a training set and a testing set. This procedure up here, where we learned the decoder, would be done on a training set of images, and then they expose the humans to a testing set of images. So this reconstruction right here happens on images that the feature decoder wasn't trained on, but that the humans are looking at. The humans would be looking at the picture here on the left, and to the right you see that, as more and more iterations of this reconstruction process happen, the image gets clearer, and on the right you get pretty good-looking images for the ones on the left. Now these researchers tend to tout the success; they say, ah, look at this. But honestly: this is a leopard and this is a dog; this is a sheep and this is an owl; this is a fish and this is, like, a shell, like a mussel. You can go through these: this is a sled, and then this is a truck. So they go and say, wow, the accuracy is really good. They do a pixel-correlation accuracy, which is, you know, you just pixel-correlate things, but they also have human judgment, and the accuracy via human judgment is over 95%. That's crazy. But how do they do it? Basically, they give you this reconstructed image as a human rater, and then they give you two other images and say: okay, here are two images, let's say these two; which one did it come from? If the human determines it correctly, that counts as a hit, so the baseline probability here is 50%. So for this one right here: is it rather the owl, or rather the VCR? I mean, in that respect, it's pretty impressive what you can read from a brain, but in no way, like zero way, is this reading your thoughts. It seems to basically just reconstruct an exemplar from the ImageNet training set. The ImageNet explorer is down right now, so I couldn't look this up, but it seems to me it's just reconstructing something it knows that sort of resembles the image on the left. It is not reconstructing that image, not at all; it looks like it a bit, but vaguely. But they do some more investigation into this. First of all, here you can see what happens without the deep generator network: with an unconstrained search, it is even worse; you get these big pixel meshes right here. So you need this kind of prior over natural images, but I think the prior here comes through a bit much; the prior might be partly responsible for why the images just show something else. Then they go into an investigation where they discover that if you use more layers, the reconstruction gets better. According to human judgment, if you just reconstruct from the first layer you don't get a very good reconstruction, but if you incorporate the signal across many layers of the neural network, so you're basically trying to match many layers of the signal, then the reconstruction gets really good. We know this from things like style transfer: you can modify how close you are to the original or to the target by choosing which layers, and how many, you reconstruct to which accuracy. So this makes sense: if you only match the features of the first layer, then you get basically this blob here, but if you use layers one through seven, you get a pretty okay-ish thing that looks like this thing. I guess these are without the deep generator network so far. But now, this is interesting, and I think one of the novel things here is that they actually use multiple layers of the neural network to reconstruct. The interesting thing, and I think this is pretty cool, is that they can now do this with these shapes. These shapes aren't natural images, and they have not been seen in training, but
still, as you can see, when the human sees, for example, the plus shape, you get a pretty clear plus shape, and that happens for a lot of these things right here. These are, I would say, fairly okay-ish reconstructions of what the human sees, and that is fairly neat. For the alphabetical letters and shapes, you see that even the pixel correlation is now pretty high, and the human judgment is again high. Here the human judgment kind of makes sense, right: if you ask, is it like this shape or this shape, then it makes more sense to evaluate it like this. So for the shapes, I am fairly impressed that they can reconstruct these. What they're now trying to do is infer imagined images. Basically, they tell a human: please imagine an image, and they show it to the human. So it's not really imagining, it's basically recalling: they show you this image, then you close your eyes, they take the image away, and you just try to imagine it. You can see through the reconstruction process that this works out sort of, ish: the cross here kind of comes through, and the plus sort of comes through. These are the high-accuracy ones, the samples where it worked, and there are also samples where it didn't work, like here, where you see it doesn't really come through. So either there is really a difference between imagining something and seeing something, or this method just isn't very good per se, or humans aren't really good at imagining; there are lots of possible explanations. And here is the same thing if you imagine natural images: they report that if humans just recall or imagine these images, then the reconstruction doesn't work at all. That might be due to the fact that in your recollection you basically just remember the important things about something; you don't remember the exact pixel values, and therefore your visual cortex doesn't respond in the same way. It's interesting even per se to think about this, but I have my doubts about this entire system, so I don't want to draw too many conclusions here. Suffice to say that sometimes it actually can read your thoughts: if you just think of a shape (that's the stuff up here), it can sort of, kind of, a bit, make out the shape you're thinking about. All right, this was it for this paper. I basically mainly wanted to show you what I found, and I have to say I'm pretty impressed with this, even though, like, this is a laptop, this is not a VCR; this is a VCR. It's more of a nearest-neighbor thing really than a reconstruction, I think, but that's my opinion. Yes, so if you liked this, give it a like, subscribe if you're still here, and I look forward to next time. Bye!
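The two-alternative forced choice described in the Results section can be mirrored with the paper's other metric, pixel correlation: for each reconstruction, check whether it correlates more with its own source image than with a random distractor. The data and function names below are synthetic, a sketch of the evaluation scheme rather than the paper's exact protocol:

```python
import numpy as np

def pixel_correlation(a, b):
    """Pearson correlation between two flattened images."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def two_afc_accuracy(reconstructions, originals, rng):
    """For each reconstruction: is it closer (by pixel correlation) to its
    own source image than to a randomly chosen distractor? Chance is 0.5."""
    n = len(originals)
    hits = 0
    for i, rec in enumerate(reconstructions):
        j = rng.choice([k for k in range(n) if k != i])   # distractor index
        if pixel_correlation(rec, originals[i]) > pixel_correlation(rec, originals[j]):
            hits += 1
    return hits / n

# Synthetic demo: "reconstructions" are just the originals plus noise,
# so the accuracy should land well above the 50% chance level.
rng = np.random.default_rng(0)
originals = rng.normal(size=(20, 8, 8))
reconstructions = originals + 0.3 * rng.normal(size=(20, 8, 8))
acc = two_afc_accuracy(reconstructions, originals, rng)
```

This also illustrates the video's criticism: a reconstruction only has to beat a random distractor, so a vague, exemplar-like image can still score very high on this metric.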
