State-of-Art-Reviewing: A Radical Proposal to Improve Scientific Publication


Yannic Kilcher · 01.04.2020


Video description
Peer Review is outdated and ineffective. SOAR is a new and revolutionary way to distribute scientific reviewing and scale to the new age of faster, better and more significant research. https://arxiv.org/abs/2003.14415

Abstract: Peer review forms the backbone of modern scientific manuscript evaluation. But after two hundred and eighty-nine years of egalitarian service to the scientific community, does this protocol remain fit for purpose in 2020? In this work, we answer this question in the negative (strong reject, high confidence) and propose instead State-Of-the-Art Review (SOAR), a neoteric reviewing pipeline that serves as a 'plug-and-play' replacement for peer review. At the heart of our approach is an interpretation of the review process as a multi-objective, massively distributed and extremely-high-latency optimisation, which we scalarise and solve efficiently for PAC and CMT-optimal solutions. We make the following contributions: (1) We propose a highly scalable, fully automatic methodology for review, drawing inspiration from best-practices from premier computer vision and machine learning conferences; (2) We explore several instantiations of our approach and demonstrate that SOAR can be used to both review prints and pre-review pre-prints; (3) We wander listlessly in vain search of catharsis from our latest rounds of savage CVPR rejections.

Authors: Samuel Albanie, Jaime Thewmore, Robert McCraith, Joao F. Henriques

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Table of contents (11 segments)

Introduction

All right, hi everyone. Today we're looking at "State-of-Art-Reviewing: A Radical Proposal to Improve Scientific Publication". This has been on my mind for a while: the review process for modern science, especially machine learning, is just broken. I've spoken numerous times about the fact that we need to replace it with a better system, and now Samuel Albanie et al. have actually come up with such a system, which we're going to explore today. I am a big fan of this work and I'm 100% on board with it. So they basically say:

What is peer review

Peer review forms the backbone of modern scientific manuscript evaluation. If you don't know what peer review is in machine learning right now: you have some genius idea (here is your idea; that's a light bulb, by the way), you write it up into an eight-page PDF (yes, it must be a PDF, and eight pages) and you submit it to a conference, hoping to get it accepted into the conference proceedings. The conference organizers, of course, are just a bunch of people; they can't review the thousands of submissions that come in by themselves. So what they do is recruit experts, called peers. Peers are other people who have usually written papers of their own, so they can critique each other's papers, and they decide what gets accepted and what doesn't. Now, I've spoken numerous times about how super noisy this is: there aren't enough peers, and they're not experienced enough, so whether or not your particular idea gets accepted is extremely dependent on chance, usually on a coin flip. The whole system is overloaded and just makes no sense. The paper asks the same question: is this protocol still fit for purpose in 2020?

Requirements

Their answer is no (strong reject, high confidence): we need to replace it. You can already see that they want to automate this away with their state-of-the-art review score, an out-of-10 score that could be integrated into something like arXiv and displayed right away. They state some requirements for the new system. First, it should have the ability to scale, which is very important; our current review system doesn't have this. The current system relies on other humans reviewing your paper, which means the number of reviewers has to scale with the number of papers, and that just isn't the case currently. A new review system must have the ability to scale, and automating the reviews away, or scaling them up in a distributed fashion, achieves this. Second, speed: right now, if I submit my manuscript for review, it takes months, and science progresses faster than that, so a speedier version of peer review is definitely required. Third, consistency, and this is the most shocking part. The famous 2014 NeurIPS experiment concluded that 57% of papers accepted by one committee were rejected by another committee, and vice versa: reviewing the exact same papers, different committees came to completely different conclusions to an astounding degree. So basically you're flipping a coin on whether or not your paper gets accepted, which I think is just not acceptable. They propose these three things, speed, scale and consistency, and their new method certainly delivers them. Now let's jump down.

Three axes

Here is where they introduce this State-Of-the-Art Reviewing, SOAR. They say the quality of a scientific work can be judged along three axes: efficacy, significance and novelty. So there are these three pillars. Efficacy means, roughly, how effective your work is at achieving its goal; in machine learning that's usually training a good classifier or something like this. The second one is significance: how relevant is what you've done to the field? And the third one is novelty: a scientific work should be an original contribution to the knowledge of mankind, and therefore it should be novel. The more of these three things you have, the higher your score should be, and in the middle, where all three overlap, is where the highest scores should be. So imagine this as a kind of landscape: you want to grade papers along these three axes. Now, they have a pretty good method of assessing all three in an automated fashion.

Efficacy

First of all, assessing efficacy. Efficacy, they say, is best assessed by determining whether the proposed method achieves a new state of the art. I don't think you can really doubt this; it is the gold standard of whether your paper is effective. It might be a bit of a controversial opinion, but if a paper doesn't achieve a state of the art, why would you even care? No one cares. From an implementation perspective, they exploit a convenient fact of the current research environment: you don't actually have to verify this yourself, because the authors themselves can be relied upon to state it repeatedly in the text. This is important: the authors will claim state of the art many times in the text if they have actually achieved it; if they haven't achieved it, or aren't so sure about it, they probably won't repeat it as often. This can be exploited. Now, how is the reviewing work distributed?

Distribution

Basically, imagine that all of these reviewers no longer have to do this work; it can simply be distributed to the authors of the papers themselves, because the authors encode the judgment in the text. By the way, this makes it kind of an NLP approach to reviewing, NLP mixed with game theory. The authors themselves, if they have state of the art, will put that into the text a lot (you'll have to do some stemming and such to catch all the variants).

Count

It's a bit controversial, but the authors here propose to simply count the number of occurrences of "state-of-the-art" in the text, case-insensitively (very important). It stands to reason that a higher state-of-the-art count is preferable, of course.
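As a minimal sketch of this counting step (my own illustration, not the paper's verbatim code; treating spaced and hyphenated spellings as equivalent is an assumption):

```python
import re

def sota_count(text: str) -> int:
    """Count occurrences of 'state of the art', case-insensitively,
    accepting both spaced and hyphenated spellings."""
    return len(re.findall(r"state[ -]of[ -]the[ -]art", text, flags=re.IGNORECASE))

print(sota_count("We achieve state-of-the-art results. STATE OF THE ART!"))  # 2
```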

Significance

All right, so the second thing, significance, and this might be a bit controversial too. They make the claim that significance is measured by efficacy, so they simply reuse the efficacy term. If your paper is effective at achieving its goal, you can also say it's significant for the community, because again: if you have state of the art, then your paper is significant; if you don't, it's obviously not significant, because why should it matter? If you don't have state of the art on a given task, it's useless.

Novelty

All right, so we weight that term twice; that's pretty good. And then novelty. Here they take much the same approach: they say the authors will probably state this themselves, so how often they use the word "novel" in the text will probably be an indication of novelty. I don't think so, though.

Related work

They do the smart thing of excluding the related-work section from this count. They say they make the key observation that the individuals best placed to make this judgment are the authors themselves, since they have likely read at least one of the works cited in the bibliography. I don't agree here; I think a better method would be to simply count the number of references: the lower the number of references to related work, the higher the novelty. Think about it: if these are the current papers and you place your paper here, in the middle, you'll have a lot of related work, so it's not that novel. If you're way out here, you'll have maybe one or two related works, so having fewer references means being way more novel. That would be my criticism. So for this novelty term, I think it should be replaced by some kind of graph centrality measure, or simply the count of how many references you have would probably be enough.
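As a sketch of this novelty count (the section-splitting heuristic is my own assumption, not the paper's; the reference-count alternative suggested above is included purely as an illustration):

```python
import re

def novelty_count(paper_text: str) -> int:
    """Count uses of the word 'novel', skipping the related-work section.

    Assumes (hypothetically) that sections are delimited by Markdown-style
    '## Heading' lines; parsing a real PDF would take more work.
    """
    sections = re.split(r"(?m)^## ", paper_text)
    kept = [s for s in sections if not s.lower().startswith("related work")]
    return sum(len(re.findall(r"(?i)\bnovel\b", s)) for s in kept)

def novelty_by_references(num_references: int) -> float:
    """The alternative suggested in the video: the fewer works a paper
    cites, the farther it sits from existing work, hence the more novel."""
    return 1.0 / (1 + num_references)
```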

Score

All right, so they define their score: as we saw, it's the SOTA term weighted twice, in a geometric mean with the novelty term that I've criticized, and they attach the suffix "out of 10", because an out-of-10 score is easy to interpret. As you saw in the mock arXiv integration earlier, this makes the grade easy to read right there. They even give code in the paper themselves showing how to implement it; it's pretty easy (a rough sketch follows below). Even though it's quite a short paper, it's thorough, it's a good new method, and I think this could revolutionize publishing. As a bit of a bonus, they even give the official pronunciation of state-of-the-art reviewing, which is pretty smooth. With that, I hope you enjoyed this, and if the authors could just be a little more subtle next time, that would be great. Yeah, that's all. Bye.
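Putting the pieces together, here is a rough reconstruction of the kind of scoring function described above (my own sketch based on the video's description, not the paper's actual code: the SOTA term enters the geometric mean twice, and the "out of 10" suffix is simply attached):

```python
import re

def soar_score(paper_text: str) -> str:
    """SOAR-style score as described in the video: efficacy and significance
    are both measured by the state-of-the-art count, so that term appears
    twice in a geometric mean with the novelty count."""
    sota = len(re.findall(r"state[ -]of[ -]the[ -]art", paper_text,
                          flags=re.IGNORECASE))
    novelty = len(re.findall(r"(?i)\bnovel\b", paper_text))
    score = (sota * sota * novelty) ** (1 / 3)  # geometric mean, SOTA weighted twice
    return f"{score:.1f} out of 10"  # suffix attached for interpretability
```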
