[ML News] Devin exposed | NeurIPS track for high school students
17:47

Yannic Kilcher · 27.04.2024 · 41,098 views · 1,433 likes

Video description
OUTLINE:
0:00 - Intro
0:21 - Debunking Devin: "First AI Software Engineer" Upwork lie exposed!
07:24 - NeurIPS 2024 will have a track for papers from high schoolers.
13:29 - Opus can operate as a Turing machine.
13:47 - An AI-Powered, Self-Running Propaganda Machine for $105
14:27 - TechScape: How cheap, outsourced labour in Africa is shaping AI English
16:25 - Is ChatGPT Transforming Academics' Writing Style?

References:
https://news.ycombinator.com/item?id=40008109&s=09
https://www.youtube.com/watch?v=tNmgmwEtoWE
https://www.youtube.com/watch?v=xE2fxcETP5E
https://twitter.com/itsandrewgao/status/1779369373737668669?t=omW3DvRNmZyce8oo0Ehf1g&s=09
https://twitter.com/0interestrates/status/1779268441226256500?t=tGwngUpChSD2YZ0VQDJHAA&s=09
https://twitter.com/thegautamkamath/status/1778580754785550819?t=Qq1nLUIOyfRfBbZ6BHdXPw&s=09
https://twitter.com/vipul_1011/status/1778619720964419930?t=225aakPnHb-ojIjveaWkkg&s=09
https://twitter.com/avt_im/status/1778913195408626110?t=UPtduAKTX1uvq8Wa_EQOWg&s=09
https://arxiv.org/pdf/2402.05120.pdf
https://twitter.com/ctjlewis/status/1779740038852690393?t=AhIQM4rBUim-IWEkXL7OVQ&s=33
https://www.wsj.com/politics/how-i-built-an-ai-powered-self-running-propaganda-machine-for-105-e9888705
https://twitter.com/ylecun/status/1780728376283521191?t=rbTfUT7IWzXy83fvr-f4hw&s=09
https://www.futureofhumanityinstitute.org/
https://www.google.com/search?q=alex+hern+guardian+delve&oq=alex+hern+guardian+delve&gs_lcrp=EgZjaHJvbWUyBggAEEUYOTIHCAEQIRigATIHCAIQIRigATIHCAMQIRigATIHCAQQIRiPAtIBCDQ5NTVqMGo0qAIAsAIB&sourceid=chrome&ie=UTF-8
https://www.theguardian.com/technology/2024/apr/16/techscape-ai-gadgest-humane-ai-pin-chatgpt
https://arxiv.org/pdf/2404.08627.pdf

Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Table of contents (7 segments)

Intro

Hey, how's everyone doing? It's a beautiful Monday, and this is a selection of news from around the release of Llama 3 and Phi-3, around these big announcements. So some of this stuff is going to be a bit older than that, but I think it's still cool to check in with what's going

Debunking Devin: "First AI Software Engineer" Upwork lie exposed!

on. One story that has garnered attention, you may have heard, was this exposé on Devin. If you have followed the trend, Devin was this automatic software engineer that's been released. I have made a news segment about that, and there's big hype around it. Devin is a system that has a programming interface. This is how Devin looks: it has a chat where you give it instructions, it has a shell, it has a browser, it has a code editor, and it has a planner. What it will do is coding tasks for you.

That being said, they did a big announcement and put some demo videos on YouTube. One of these demo videos was Devin solving an Upwork task. Upwork is a platform where you post things for people to do, and some of those things are programming things. You'll say, "Hey, I need a script that does XYZ," and then someone else will take it, do it and give you the result, and you pay money for it. So it's a gig work platform. They said that Devin actually solved a task on Upwork. This video by Internet of Bugs goes into a deeper analysis of exactly that. We also have a different video, by this person, computer vision engineer, who is the original author of that Upwork posting. At the beginning of the video he says, "Hey, I've seen this Devin announcement, I watched the video," and he recognized his own posting.

I will cut it short; you can watch these videos, I'll put the links into the description. What it comes down to is that there are multiple problems with how Devin was advertised versus what it actually did, and some of that is even visible in the video itself. The biggest issue is that the task had a very different description from what Devin actually ended up doing. The task was essentially: "This code repository is a bit old. What do I need to do to make it run on an EC2 instance?" It would have consisted of reading the README file and running one or two commands from the README to set up the environment correctly. The person who posted it just didn't have time to try this on that particular instance, and that's why they gave it out as an Upwork task. They were like, "Whatever, someone just tell me how it works." That was the task.

Now, what did Devin do? To be fair, Devin didn't do that. The point is, Devin delivered something completely different. Devin did code fixes; it solved a bug or something like this. There was no bug. I mean, I guess there were bugs somewhere, but the task was not about solving a bug. Devin just went out and did something. Part of that is to blame on the users, the operators of Devin, who didn't even input the actual Upwork task but just input the code repository and then said something like... I'm not entirely sure; it says somewhere in the video what exactly they input into Devin, but it wasn't the actual Upwork task. They input something like, "Can you please solve the bug?" I'm not sure anymore. In any case, they didn't input the task, and then Devin went out and made changes and fixed some bugs. But it turns out it actually introduced those bugs, and it referenced files that never existed in the repository, and so on. So it turns out Devin just kind of swoops around and does some stuff, and by doing so it introduces some bugs, and then it's like, "Oh wow, these are bugs, let me fix the bugs," and then it fixes the bugs. At the end it has done things, but not the things that were in the description.

So on the one hand you can say it's probably a bit shady marketing to release that and say, "Look, here is an example of Devin solving a real-world Upwork task." On the other hand, you can say it's probably about equivalent to 50% of the Upwork work that you'll get, so who's really in the wrong here? I do have to say this is somewhat to be expected. Even though I'm a big fan of AI code models like GitHub Copilot and so on, I do think the planning ahead and comprehensive understanding of stuff isn't necessarily at the level yet where it can be used. Therefore I would have expected that this happens to Devin. But the fact that it happened even in the task that they advertise as "we've solved a real Upwork task," with the output not even being the thing that the task asked for, is a bit astonishing.

There is a solid summary on Hacker News if you want to read it in a short and condensed way. And there have been a few of the more prominent, or more seen, voices on Twitter saying you also have to be a bit careful here. For example, they never said they have solved software engineering, and that's true, they didn't. "The video was not sped up": one of the other criticisms that the person making the video pointed out was that some of the timestamps indicate that Devin was running for probably much longer, multiple hours or even days, until the task was finished. But it is also true that I don't think they have ever really claimed that it does anything differently, or tried to deceive about that.

Where I stop agreeing is where it says something like: Cognition Labs can't be blamed for the hype, that's just Twitter; other users who share their use of Devin have been very honest, and so on. This user might have been very honest, but as I pointed out in my video on Devin, this was a clear PR campaign. This was an orchestrated PR campaign. They had two separate articles in, I believe, Business Insider ready to go on the day that Devin was released, including pictures of the team. So this didn't come about just by accident; they had a heavily coordinated campaign to make as much fuss about Devin as possible. I recognize Twitter then makes up its own story, but a lot of the claims, I am very sure, can be attributed to them, and they can be partially blamed for the hype. I don't agree with voices being like, "Oh, they are just tinkerers, they tinkered something and then people took it and said, 'Oh wow, this is AGI,' but the tinkerers just wanted to tinker." No. They are a business, they are a startup, and they manufactured this giant hype, partially by overclaiming things, and this is the result of it.

Not everything, though. There are also some fairly often-viewed tweets like this one, where people have completely misinterpreted the original exposé videos by Internet of Bugs and by the person who originally posted the Upwork post. So if you read stuff around Devin, I recommend you just go to the videos and watch them for yourselves before you read opinions about the videos. That, of course, does not include my opinion, because my opinion is the correct opinion. Of any other opinion, be very skeptical.

In other news, NeurIPS

NeurIPS 2024 will have a track for papers from high schoolers.

introduces a track for papers from high schoolers. NeurIPS is, you know, the conference for research. At the lowest level there are master's students submitting to NeurIPS, there are independent researchers submitting to NeurIPS, but it's mostly PhD students, postdocs, professors and so on. This is a world-leading professional research conference, and now they have a track for papers from high school students.

Now, I do see the appeal of broadening research, making research available to people at a younger age, encouraging them, pulling them in, and so on. If there is a young mind that's brilliant and just wants to do machine learning, why not let them submit? Why not let them be part of the movement, part of the community? All of that is totally fine. But I have a bit of a different take, and if you are on Twitter you may have seen it, because it was quite popular. I'm talking a lot about Twitter today; I'll probably stop with that.

My issue with this is that the knowledge necessary not only to do machine learning research, but to effectively write papers on machine learning research, to write them up in a way that will be accepted and so on, is not introduced in the standard curriculum, not even in the elective curriculum, until bachelor's or master's level, and certainly not in high school. That means the children here are going to be children of, frankly, either academic parents or very rich parents. To me that is a bit of the wrong approach to broadening the scope of the research community. This will not be a selection for the brightest kids around; this will be rich kids, kids whose parents are equipped and able to help them with this.

Now, some people say, "Who cares? It's about the advancement of science," and whatnot. To those it should matter too, because if you truly believe that some bright minds are out there and we can invest resources into discovering and advancing them, then certainly our resources will be better invested in actually finding those good minds than in simply restricting ourselves to the narrow subset whose parents can help them write a paper.

I'm not a fan of this because I do believe that a lot more people should have access to higher education, if they are suited and capable and willing and motivated and so on. A lot more people should be part of the research community. We do in fact need the best minds on this, and there's already a huge selection pressure to just get kids of academics and kids of rich people into these tracks. It's not their fault they were born into a rich family or an academic family; you shouldn't punish anyone for that. However, you shouldn't put extra resources, which are finite, towards pushing that group even more. You would rather put the rare resources towards actually going and grabbing the people who would be brilliant but who, because of their family circumstances, because of just attending a regular public school, would never get the idea to even do machine learning research. You could do that: go out, find the kids who are interested and then help them write papers, because they have no one in their life who can help them write papers. Put the resources there. This will just select the ones that are already there.

Plus, even more so, these high school students, the ones that already are on a success path due to their circumstances, will get even more of a leg up once it comes to actually applying for a PhD, because they've been publishing since high school, whereas someone else will only be publishing since their master's. Even that is crazy, but nowadays you will see PhD applicants already having a paper, which used not to be the most common thing but nowadays seems to be common. And now you've got all of these people who are already benefiting and have been publishing since high school.

A lot of people have written back to me and said, "Well, the internet is available, YouTube is available, you can educate yourself, you don't need academic parents, all the resources are there." And there will be some people like, "I did it, I come from a poor family, I did it." Fine, okay, of course there are examples of people who did it. But in general the problem isn't that YouTube doesn't exist or the internet doesn't exist. The problem is that you may grow up in an environment where this is never even a question, a topic, an idea or a path that is communicated to you, and neither do most public schools communicate, "Hey, by the way, how about you submit to NeurIPS? Here is how you do it." If you are not in that environment, it's not a lack of skill, it's just a lack of information, outlook and so on. That's why I have the most problem with this, because what you would want to do with finite resources is go after exactly these people, who are skilled, who would be brilliant, but who just due to their life circumstances are not anywhere near even having the idea of going on the internet and educating themselves about this stuff. And if they did, they would have no clue how to write it up and submit it anywhere.

That was my rant. The memes about this are good: "High school? Isn't that too late for a publication?" There's also, on Chinese social media, one professor who already asked their PhD students whether they can help write a paper for one of their kids so they can get into college abroad more easily. Yeah, it's totally about the goodwill. It's totally not about children of academic and rich parents.

In better news, Claude

Opus can operate as a Turing machine.

Opus can operate as a Turing machine. The user Lewis has done experiments on actually using it as a Turing machine, with a tape and symbols and so on, deducing the rules from that. Super interesting; you can check out the code of that. The Wall Street Journal has an

An AI-Powered, Self-Running Propaganda Machine for $105

article, the Saturday Essay: "How I Built an AI-Powered, Self-Running Propaganda Machine for $105." "I paid a website developer..." How "I built"? "Paid"? I don't think you exactly understand the words "I built", but fine. "...AI-generated pink-slime news site, programmed to create false political stories. The results were impressive and, in an election year, alarming." Yeah, this is a total whinefest. Why? Like, why? There's an interesting article in the Guardian on

TechScape: How cheap, outsourced labour in Africa is shaping AI English

a story that we tackled last time. We looked at the word "delve" being completely over-represented in ChatGPT outputs: you can detect whenever a group of people uses ChatGPT more, because the word "delve" will be super over-represented, and so will a few other words. This article goes to depths explaining why that might be, and it's really cool. Yeah, this is what I've shown: PubMed articles using the word "delve". It essentially comes down to the crowd workers who supposedly made a lot of the data for OpenAI, and they do tend to be in part Nigerian. In Nigeria, "delve" is much more frequently used in business English than in England or the US. So the workers training their systems provided examples of input and output, they used the same language, eventually ending up with an AI system that writes slightly like an African, like business English from English-speaking Africa.

I thought that was just very interesting: essentially a style of language, it's like an export of a dialect, that now seeps through things. It would be interesting to see if people actually also start talking more like that, because obviously they now consume a lot more text and a lot more communication from other people. I've gotten emails with the word "delve" in them and I wrote back, "Hey, did you happen to use an LLM to generate that?" and the person actually told me, "Yes, I did." It was a salesperson. I did not end up buying from them, but I found it highly funny. And because we're now consuming more of that, does it mean that we might also take over some of the accent that is present in that training data? That would be super interesting. It'll be interesting to see in a couple of years how the actual language of people has changed in response to the change in the language of language models. Also, this paper is very interesting, going into the

Is ChatGPT Transforming Academics' Writing Style?

question "Is ChatGPT Transforming Academics' Writing Style?" The paper itself says: "We find that ChatGPT is having an increasing impact on arXiv abstracts, especially in the field of computer science, where the fraction of ChatGPT-revised abstracts is estimated to be approximately 35%, if we take the output of one of the simplest prompts, 'revise the following sentences', as a baseline. We conclude with an analysis of both positive and negative aspects of the penetration of ChatGPT into academics' writing style." Now, as I said, that might very well be, but I do think that the word change rate here could also just be because the topics have changed significantly. They do have quite a bit of other investigations. I know the word "significant" is definitely one that is overused by ChatGPT, and you can see that these definitely go up here. So I would say this is much more solid evidence of the influence of ChatGPT than the plot of change above.

All right, that was it. Thank you so much for being here. As I said, a few of these updates are from around the big announcements of Llama 3 and Phi-3. I sure hope that by the time this video is released there's not some other big announcement. If there was a big announcement by the time you watch this video, this will be funny; otherwise it will just be a person saying something. Bye-bye.
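The signal discussed in the last two segments, words such as "delve" or "significant" turning up more often once ChatGPT is in use, can be illustrated with a small frequency comparison. This is only a minimal sketch and not the estimation method of the paper referenced in the video: the function names `word_rate` and `overuse_ratio` and the toy abstracts are hypothetical, chosen purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_rate(texts, word):
    """Fraction of all tokens in `texts` that are exactly `word`."""
    counts = Counter()
    total = 0
    for text in texts:
        tokens = tokenize(text)
        counts.update(tokens)
        total += len(tokens)
    return counts[word] / total if total else 0.0


def overuse_ratio(before, after, word):
    """How many times more frequent `word` is in `after` than in `before`."""
    rate_before = word_rate(before, word)
    rate_after = word_rate(after, word)
    if rate_before == 0:
        # The word never appeared before: the ratio is unbounded (or flat).
        return float("inf") if rate_after > 0 else 1.0
    return rate_after / rate_before


# Toy data, made up for illustration only (not real abstracts).
abstracts_before = [
    "we study the convergence of gradient descent",
    "we delve into sparse attention",
]
abstracts_after = [
    "we delve into the convergence of transformers",
    "here we delve into significant scaling trends",
]

for w in ("delve", "significant", "convergence"):
    print(w, overuse_ratio(abstracts_before, abstracts_after, w))
```

A ratio well above 1 for a word across a large corpus is the kind of shift the segments describe, but as noted in the video, a raw change in word rates can also come from topics shifting, so a count like this is suggestive rather than conclusive.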
