[Drama] Who invented Contrast Sets?
Duration: 9:30

Yannic Kilcher · 08.04.2020 · 2,876 views · 54 likes


Video description
Funny Twitter spat between researchers arguing who was the first to invent an idea that has probably been around since 1990 :D

References:
https://arxiv.org/abs/2004.02709
https://twitter.com/nlpmattg/status/1247326213296672768
https://arxiv.org/abs/1909.12434
https://twitter.com/zacharylipton/status/1247357810410762240
https://twitter.com/nlpmattg/status/1247373386839252992
https://twitter.com/zacharylipton/status/1247383141075083267

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

I love me some good Twitter drama. Look at this, this is awesome. So after this contrast set paper appeared, and I've done a video on that, the author tweeted it out with one of these long Twitter threads with screenshots and all. This seems to be the new marketing tool of academics. As you know, I'm not a fan of this paper; I think the number that comes out of such a contrast set is either useless or counterproductive, and you can see my video on that.

In any case, there was another researcher, Zachary Lipton, who felt like he needed to jump in here, saying: before the media flips and the retweet party gets out of control, this idea exists, has been published, has a name and a clear justification; it's called counterfactually augmented data. This is amazing, look at that. And here's the published paper, and of course Zach Lipton is an author on that paper. So let's just read the abstract. I haven't read the paper, but let's just read the abstract; I have it here in my nifty tool so we can analyze it.

This paper, if you read the abstract, sounds similar, right? It talks about alarm over the reliance of machine learning systems on spurious correlations, so it talks about the same problems. And what do they say? "Given a document and its initial label, we task humans with revising each document so that it accords with a counterfactual target label, retains internal coherence, and avoids unnecessary changes." So this sounds very similar to what these contrast sets do. The counterfactual target label corresponds to the contrast set's requirement to change the label. "Retains internal coherence" is, in the contrast sets, simply given by the requirement to conform to the intent of the dataset makers, and that intent probably includes internal coherence. And "avoids unnecessary changes" corresponds to the contrast set only searching in the local environment of a test set example. So you see that the definitions are pretty similar.

Then they go on and say that classifiers trained on original data fail on their counterfactually revised counterparts and vice versa. This experiment was also done by the contrast set paper. And then they say classifiers trained on combined datasets perform remarkably well, just shy of those specialized to either domain. So immediately we see some differences as well. The main difference I see is that they say "we task humans", and then they train on the counterfactually revised counterparts, which probably means they used Mechanical Turkers here. They say humans because if you want to create a training dataset, you need lots of data, so they probably take a dataset and run its training data through something like Mechanical Turk to get annotations. This is exactly what the people of the contrast sets claim is wrong with the current pipeline.

So this counterfactually augmented data has some elements that are exactly the same, namely how the examples are constructed, but not who constructs them and for what reason. In the contrast sets it's experts, and it's for testing: they say the experts that make the dataset should provide an additional contrast test set. This is just my opinion, but if this counts as the same idea, then 95 percent of all research counts as the same idea as something that Jürgen Schmidhuber has done in the 1990s, which of course Schmidhuber will eloquently argue he did: he invented GANs, basically the
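The abstract's three criteria (flip the label, stay coherent, make only minimal edits) and the consistency-style evaluation both papers run can be sketched in a few lines. Everything below is illustrative: the toy keyword classifier and the example pair are my own stand-ins, not taken from either paper.

```python
def toy_sentiment_classifier(text: str) -> str:
    """Stand-in model: a keyword lookup, just to make the sketch runnable."""
    return "positive" if ("great" in text or "love" in text) else "negative"

# An original test example and its minimally edited, label-flipping revision.
# Per the criteria quoted above: the edit flips the gold label, keeps the
# text coherent, and avoids unnecessary changes.
contrast_pair = {
    "original": ("The acting was great and I loved it.", "positive"),
    "revised":  ("The acting was dreadful and I hated it.", "negative"),
}

def consistent(pair, model) -> bool:
    """A model is 'consistent' on a pair iff it labels BOTH variants correctly."""
    return all(model(text) == gold
               for text, gold in (pair["original"], pair["revised"]))

print(consistent(contrast_pair, toy_sentiment_classifier))  # True
```

The metric reported over many such pairs (the fraction on which a model gets every variant right) is the kind of number the video argues about; the disagreement in the thread is less about this mechanism than about who edits the data (crowdworkers vs. dataset-authoring experts) and whether it is used for training or only for evaluation.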

Segment 2 (05:00 - 09:00)

same thing. So yeah, it's not the same. I have to say, it's very close, but it's not the same, and as I understand it, they even cited the other work. So then the bickering starts, and this is just funny to me. Zach Lipton jumps in here and says this has been published, has a name and a clear justification, it's called counterfactually augmented data, and here is the published paper, the one we just looked at, right?

And then Matt Gardner answers, and he says Zach and Divyansh's work is excellent, he recommends you all go look at it, and that theirs provides a different, concurrent take on similar issues. Lipton's complaint is that while it is in the related work section, it is mischaracterized and misattributed as contemporaneous work, so his position is really that they stole an idea, and the two groups were apparently in contact with each other during that time. This Matt Gardner then says what the differences are: they take a geometric view, they demonstrate it on a wider variety of tasks. I mean, for all intents and purposes, if you go through any of the research, go to computer vision, go to NLP, you'll find the same thing. I review a couple of papers each year that want to produce data that better defines the decision boundary, like these people here. These ideas just get rehashed over and over in slightly different forms; these two are particularly close, but still.

And then see how they bicker. One says our paper was finished two months after theirs, we started the project well before, why do we feel defensive? And then the other answers: this is absolutely false, our paper was drafted in July, your paper was finished the night before the ACL deadline, this is not two months but half a year; why do you presume to know when we started, drop the nonsense, we did this work in May 2019 and presented the first public results in July; drop the posturing, so much of what you're doing here is the very cancer in the system. I mean, I agree that just slightly refining ideas that were previously there is a very bad problem in academia, so this is correct to point out, but I don't think this particular instance is particularly bad. And then he says I'm afraid you're simply mistaken, there's a history of publishing similar things, and so on. I just invite you to read the thread; it's beautiful.

The last thing I'll say here: if this counterfactually augmented data is in fact the first instance of this general idea, of producing counterfactually augmented data that fulfills these criteria, I would be extremely surprised, because this has nothing to do with deep learning, and the real novelty in our field is mostly deep learning. I'm pretty sure someone must have thought of something like this back when everyone was just doing grammars and manual features and things like that. I would be extremely surprised if this hasn't been around in one form or another, and then the authors of that could make exactly the same argument. That being said, it is fairly close. The fun part here is that it is actually a fairly similar idea, except the focus is on different things, and it's also on different datasets. And I believe, as I said, 95 percent of research falls into exactly this category. So much fun. Check it out. Yeah, bye.
