# What India can learn from DeepSeek

## Metadata

- **Channel:** Varun Mayya
- **YouTube:** https://www.youtube.com/watch?v=UR3HGeldsAQ
- **Date:** 11.02.2025
- **Duration:** 23:44
- **Views:** 253,931

## Description

Special thanks to Yandex for supporting this video, go check them out by clicking on this link: https://bit.ly/4170Hxp.

Read the Medium article for more information: https://medium.com/yandex/yandexs-high-performance-profiler-is-now-open-source-95e291df9d18

In this video, we break down three crucial questions about AI: Should India build its own LLM? What does the new Chinese model DeepSeek mean for AI? How can YOU actually make money in AI? What’s fascinating is how the answer to all three questions is the same.

Watch as we decode how DeepSeek disrupted the game by cutting AI training costs from $100M to just $5-6M, and what this means for the future of AI development as bigger players adjust their strategies and join the table too. I’ll also share how we built the world’s largest AI avatar platform, generating 400M+ views across channels, and how we did it by playing our cards right.

00:00 - Introduction
01:49 - The AI vs Software Development
03:00 - OpenAI's First Big Bet
04:27 - Deep Seek Changes the Stakes
08:33 - India's Place at the Table
14:27 - Playing the River
16:01 - AI Avatar Success Story
20:39 - Closing Thoughts

## Contents

### [0:00](https://www.youtube.com/watch?v=UR3HGeldsAQ) Introduction

Okay, so I want to answer three questions. Number one: should India build its own LLM? Number two: this new Chinese model, DeepSeek, how does it work and what does it really change for AI? And number three: what should you build in AI to make money? Actually, all three of these questions have nearly the same answer, but to explain it to you, we are going to play a game of poker. Yes, nothing explains this scenario better than a game of poker, and trust me, whether you've played poker or understand the rules or not, you are going to want to watch this video. Ready? Here goes.

So we're at this table where the biggest AI companies in the world are playing poker. There's OpenAI here, there's Google here, and then there's a new Chinese model, DeepSeek, here. There are many other players at the table too, and we'll come to them in a bit. Now, they all have different amounts of money with them. For those who don't know the rules of poker, you don't actually need to know them fully for this video. All you need to know is that there are five cards. Turn by turn, players decide whether to put in money and open a card. You can open a card only if you put in money, and then everybody else at the table can see what card you put up. For all the other players to keep playing, they also have to put in money. So if you don't have enough money to play, or you don't have the courage to put it in, you are out of the round. Every turn, like I said, a card is opened. The goal is that at the end you need to have a better set, which includes the cards on the table and the cards in your hand, than your opponent. You don't need to understand all of this for the video. All you need to understand is that every card opened changes the risk profile of the game. The game we're talking about is building artificial

### [1:49](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=109s) The AI vs Software Development

intelligence. Building artificial intelligence is different from building software. In software, you typically know what you're going to build. You have an idea of the end product, and it's just about putting in money and people to build it out in a certain amount of time. In AI, however, you have no idea what the next breakthrough is, so you have to do research to figure it out. Breakthroughs in research are very random. If you look at arXiv, the platform where papers are put up, you'll see that every day 50 to 100 new papers appear. You have no idea which methods and which ideas are true or even work. Many of those are just theories with very little evidence to support them, and many ideas haven't even made it into papers.

So let's go back a few years. We're at this table, all cards are down, and nobody has figured anything out. Let's say Google, the company, out of the many things they're working on in research, puts out one paper called "Attention Is All You Need", and it's about this new architecture called Transformers. But they don't pursue it, because they don't know if it'll be useful as a product or not. So they open this card, they look at what's in their own hands, and they say, "Well, we don't feel confident enough to keep betting bigger." So they just leave it alone. Now, for a long while nobody does anything, but suddenly this scrappy

### [3:00](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=180s) OpenAI's First Big Bet

little research firm called OpenAI says, "I think I've read that paper and I believe in it. I have a belief that maybe this new technology, Transformers, this new architecture, will be useful to make AI. I'll put some money in." So they bet money. The next card opens, the second card, and it's an LLM. All the other players at the table realize this new thing works: LLMs are actually useful. Now Google, which played the first card in terms of research, says, "Oh, we have money, and this thing is useful." Inspired by the confidence that the LLM card is actually a valuable card to continue betting on, they decide to play. See, the second card opening changed the risk profile of the table. Once a large language model was trained, everyone at the table was attracted.

Now, Google has big money and OpenAI understands that. So they go to their friend outside the table, Microsoft, and they say, "Boss, this Google dude is going to play now, and he's going to play big money. I need money to keep playing, to see the rest of the cards, to keep playing our cards." So they do a deal and Microsoft offers them money. Now two cards are face up on the table. OpenAI and Google have several ideas and guesses about what the next card is, but nobody knows. All that's true is that it requires a lot of money to keep playing at the table. The stakes have become higher, so a lot of players are not playing because they're scared. But now

### [4:27](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=267s) Deep Seek Changes the Stakes

let's come to the hero of the story. The hero of the story is a Chinese quant fund called High-Flyer Quant, which is not really a pure AI company but a sort of financial hedge fund. They decide to play. They've made some profit, so they can buy some chips, but not as many as the big boys, and they realize that to play they have to change the rules of the table a bit. Since that second card, the LLM card, had opened, they knew that opening the next card was going to be expensive. So, in order to continue playing, they somehow had to reduce the cost of playing itself.

So, through some innovation with reinforcement learning, they make a model called DeepSeek. They realize that they can teach the model with just a reward system that encourages correct outputs and a good chain of thought. In other words, it's not fed pairs of "this is the question, this is the correct answer" to learn from, like other models. Instead, it has something called a reward function that scores whether the answer is correct for tasks like mathematics, coding, puzzles, etc., and also rewards it for producing a structured chain of thought: what is it thinking, what is the internal monologue. It's just given a thumbs up or a thumbs down, very often through automated means. If it's doing a mathematics problem, a calculator at the back end can tell it whether it's right or wrong, so in a very automated way it's able to get correct feedback.

This is how AlphaZero mastered chess and the game of Go: purely by playing against itself. No human-labeled correct moves were provided. Over many trials, it discovered efficient strategies on its own. DeepSeek-R1-Zero does the same for math problems, programming challenges, and logic puzzles. It plays through thousands of problem scenarios, testing out different lines of reasoning until it lands on the correct one. Because the model is not guided by neat human examples, it can develop very unconventional ways of expressing its logic. You might see mixed languages like English and Chinese in the same
sentence, or even weird tokens, because internally it's only chasing the question "do I solve the task correctly?" It can look like gibberish to us, but internally the model's symbolic or token usage might help it solve problems more efficiently. It's not limited by traditional language rules. This alien reasoning often disappears if you force the model to produce more human-friendly text.

They also have a new approach to mixture of experts, so the model effectively taps only 37 billion out of 671 billion parameters at once, which makes it much cheaper to run. These two things, along with many other optimizations, and along with the fact that OpenAI had already turned the LLM card face up, meant a lower risk profile. OpenAI had already demystified LLMs, which means you could train on their model outputs. Basically, the card on the left was opened by OpenAI, and because DeepSeek saw that, it learned from it. All that DeepSeek had to do was rewrite the rules of the table to make it cheaper to play, and then they turned the next card face up.

Now the risk profile of the table changes again. Until the previous card, training a model cost $100 million. Now it costs just $5 or $6 million to train a model of OpenAI o1's size. $5 or $6 million means that a lot of players can now come to the table and play, and it changes everything. It changes the story that OpenAI is telling Microsoft to keep funding their game against Google. It changes the ideas and the story of the Stargate cluster of GPUs. And it changes the attitude of the table master, Nvidia, to whom all players pay commission to play. It invites new entrants and makes this battle more competitive and winning less likely. It also makes people hold their own cards closer. It makes people more likely to cheat and to avoid opening a card on the table for others to see until they're mentally and financially sure they can take the others on.

But if you want to build foundational models, how should we think about that? Where is it that a team from India, you
know, three super smart engineers with not 100 million but, let's say, 10 million, could actually build something truly substantial? Look, the way this works is we're going to tell you it's totally hopeless to compete with us on training foundation models. You shouldn't try, and it's your job to, like, try.
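
As a rough illustration of the reward setup described earlier in this section (an automatic thumbs up or thumbs down instead of labeled question-and-answer pairs), here is a minimal sketch in Python. The `<think>`/`<answer>` tag layout, the function names, and the 0.5 weight on formatting are assumptions made for illustration, not DeepSeek's actual code.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> tag, if any."""
    matches = re.findall(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def accuracy_reward(completion: str, expected: int) -> float:
    """Thumbs up (1.0) if the final answer equals the checked result, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        return 1.0 if int(answer) == expected else 0.0
    except ValueError:
        # The model wrote something that is not an integer.
        return 0.0


def format_reward(completion: str) -> float:
    """Reward a structured chain of thought: reasoning first, then the answer."""
    pattern = r"^<think>.+?</think>\s*<answer>.+?</answer>$"
    return 1.0 if re.match(pattern, completion.strip(), flags=re.DOTALL) else 0.0


def total_reward(completion: str, expected: int) -> float:
    """Combine both signals; the 0.5 weight is a hypothetical choice."""
    return accuracy_reward(completion, expected) + 0.5 * format_reward(completion)
```

In a real training loop, the `expected` value would come from an automatic checker (the "calculator at the back end"), and the score would be fed to a reinforcement learning algorithm; that part is left out here.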

### [8:33](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=513s) India's Place at the Table

Try anyway. Okay, so now I want to talk about India building models. See, there's a risk profile to each card here. The first card cost a lot of money to open because Google was spending on a lot of research. One or two out of the many things they worked on may have worked. That's true gambling. You need billions in profits to do that, and you have no idea what will work. Once OpenAI came and the second card was opened, we knew: okay, $100 million to play, to train a model. It still makes no sense for everybody else to bet.

But here's the thing. There's an audience around the table, and they're asking our Indian players to play. Yes, there are some Indian players at the table. The audience will always cheer on a big bet, but they will laugh and boo when you lose, because the risk and the money are not theirs. The audience ultimately wants to be entertained. So they shout, they boo, and they tell TCS and Infosys, "Hey, you're sitting on so many chips, you have so much money. Play, loser! Why are you not betting?" But TCS and Infosys have a lot to lose. Not really the money they're sitting on, but reputation and, because of that, money in a different way.

Play the story out. Let's say a TCS or an Infosys commits the resources and builds the model. The LLM space is quite commoditized. The utility of whatever they end up building is very low compared to the amount spent, plus it can also distract them from the core business. They're basically going to spend money to turn the previous card up, only the card is already up and the audience is asking them to turn it up again. There's also a small risk of failure. Maybe they try building a model and it doesn't work out well. For us random people on social media, this failure doesn't matter much, but for a public company it'll reflect in the stock price. They will lose billions because of what the failure represents, which is that they are potentially incapable of building a model. The CEO will lose his job, plus take a lot of flak from the shareholders. If the expected return on investment was
good, maybe if building that LLM made them 10x more money, or if the threat was perceived as existential, maybe if the company would die if they didn't have their own LLMs, it would still make sense to take the risk. But the bigger you are, the harder you fall. And there are other LLMs, and they can just pay one of the other players at the table for API access to their models.

This is scary for most of us because of what it represents: the lack of talent and being left out of the race. But I've met a lot of really good Indian researchers. It's not that India lacks the talent. These researchers would still like to go work at FAANG or OpenAI because it's a lower perceived risk profile. If talent is taking the risk, they want to own bigger chunks of the outcome or have a much lower risk profile. So instead of going to a TCS or Infosys, where they're not sure if they'll be able to pull it off, they would rather go to the big players like an OpenAI. No matter how many chips the Indian players have, they're still the smaller player at the table compared to a Google, and they cannot convince talent or the markets to play with them.

That's why I keep saying that this risk has to be taken by a startup in India, and it's also why startups have a right to win in changing markets: a startup has no reputation to lose. We think of startups as companies, but startups are not companies. Startups are experiments to see whether a company deserves to exist in that space or not. It's almost like you go to the table as a drunk who has a little pocket change, and you say, "Well, I collected a little bit from everybody, I'll play." And the audience doesn't even know you well, so they're like, "Whatever, good to see that you're going all in with a very small amount of money." The startup is the underdog. But don't be fooled: most of the time these startups die. They lose in one round, and most of what the audience is asking them to do is uncover the same card that's already face up on the table. The
real challenge is predicting the next card, the fourth card on this table. Sounds depressing, yes, but the existence of DeepSeek changes things. Its methods are open and others are able to replicate them (DeepSeek actually put out a paper on this). Now that people are able to see this, and now that DeepSeek has proven you can drop the cost of training, small startups can play with a much smaller amount of money, and the larger Indian organizations can also gamble, because it is a much smaller bet.

By the way, speaking of cost reduction, Yandex recently open sourced Perforator, which helps cut server load by up to 20% by helping developers spot and eliminate code inefficiencies. It lets you analyze code in real time to pinpoint how much each line of code costs the company. Not just this, it even provides live insights into server and application performance, translating to savings of not just millions but billions for big companies. It not only trims infrastructure expenses but also reduces energy consumption, aligning with sustainability goals too. Finally, with Perforator, developers can write cleaner, more efficient code, helping businesses save not just millions but billions. A big shout out to Yandex for partnering with us on today's video.

Okay, I have a few more things to tell you, but let's summarize, to check you're on the same page with me so far. Doing new research is like sitting at a high-stakes poker table with all the cards face down. You have no clue if you'll win or lose because it's unproven territory, and you must bet big. It costs a lot of chips to even play: compute cost alone will run you into hundreds of millions. It's like playing blind. You can burn through months or years trying something that might turn out to be a dud. When it does pay off, you become a pioneer like OpenAI once was, but that window is narrow. OpenAI proved certain methods work, so the cards that they revealed are now partly face up. You can analyze their success, see their ideas, see some of their code
and then jump in with much less risk. DeepSeek shows that you don't need $100 million to train a cutting-edge model, maybe just $6 million, and that lowers the buy-in price for everyone. But it also means more players at the table and more competition. When someone's already shown you cards on the table, your risk drops. But new methods? That's still blind betting. Now let's

### [14:27](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=867s) Playing the River

stop right here and talk about you. You may say, "Well, what can I do? I can't play with these titans." But you can be smart about it. In 2023 I wanted to compete in AI, and I'd run businesses for 10 years at that point. I knew the games I could play and the games where I didn't stand a chance, and I stopped listening to the audience on these things, because the risk profiles of the player in the game and of the audience are different. So I asked: is there a way to play at any of these tables, but on the last card? Is there a way to play where somebody else is taking most of the risk, where I don't have to take the risk and fail nine times out of ten trying, but can play on the last card?

Remember, I told you that the risk changes card by card. That last card on any poker table we call the river. In AI, that last card is called fine-tuning. Fine-tuning is a part of machine learning where you can use somebody else's model, freeze most of the layers, then take a tiny part of the top and train it on your own data. So you don't need to fund the compute for the base model, you don't need to figure out the architecture, and you don't need to bet blind. Like I said, it's like waiting for the last card, the river, in poker. You see almost everything, and you're deciding whether to play or give up in the last second.

Fine-tuning is a conservative bet. It's good for bootstrapped companies like us because you minimize your risk. The big AI labs have already done the heavy lifting. I also believe that if you really think about your consumers, your buyers, your users, they don't really care about the poker table. They just care about what value they're getting. And to some extent, I think we've proven that playing on the last card is not a bad idea. So let me give you an
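
The fine-tuning recipe described above (freeze most of the layers, train only a tiny part at the top on your own data) can be sketched in a few lines of plain Python. This is a toy under stated assumptions: `FROZEN_BASE` is a made-up stand-in for someone else's pretrained layers, `DATA` is a made-up dataset, and all names are hypothetical. Real fine-tuning would use a deep learning framework and a real base model.

```python
import math

# Stand-in for a pretrained base model: fixed weights that are never updated.
# (Hypothetical numbers; in practice these come from someone else's training run.)
FROZEN_BASE = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]


def base_features(x):
    """Frozen layers: map a 2-number input to 3 features."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_BASE]


def predict(head, bias, x):
    """Small trainable head on top of the frozen features."""
    return sum(h * f for h, f in zip(head, base_features(x))) + bias


def mean_squared_error(head, bias, data):
    return sum((predict(head, bias, x) - y) ** 2 for x, y in data) / len(data)


def fine_tune_head(data, epochs=200, lr=0.1):
    """Gradient descent on the head only; FROZEN_BASE is read but never written."""
    head, bias = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = base_features(x)
            err = sum(h * f for h, f in zip(head, feats)) + bias - y
            head = [h - lr * err * f for h, f in zip(head, feats)]
            bias -= lr * err
    return head, bias


# "Your own data": a tiny made-up dataset of (input, target) pairs.
DATA = [([0.0, 0.0], 0.0), ([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0), ([1.0, 1.0], 2.0)]
```

Calling `fine_tune_head(DATA)` updates only four numbers (three head weights and a bias), which is the point of the approach: the expensive part, the base, is reused as-is.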

### [16:01](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=961s) AI Avatar Success Story

example. So I was at the World Economic Forum in Davos last week, and YouTube has an official guide to Davos. I'm just going to open a page from it, the official YouTube guide, where it says my name but also says something very interesting, which is that we are now the world's largest AI avatar creator, with over 60 million views. Now, that was several months ago. In November we did something close to 100 million views with that avatar. Most of you who have actually kept track of my journey know that we've been trying this since Feb 2023, before anyone had a working avatar. Not only were we one of the first few known video-and-audio avatar content creators in the world, we're also currently the largest. It's a lot of pressure, because now that everyone knows about this, there is going to be competition, and we may not be the largest in the future.

But what is it? It started as just a fine-tuned model. In fact, now you don't even need to fine-tune to make an avatar. Not only that, we've been doing advertising on the avatars, and the brands are extremely happy, because they're not buying my personal time. I'm not spending my time there. They're buying distribution. It works for us. It was low risk because somebody else had taken the risk of building the models, and all we had to do was fine-tune the top. And now we don't even have to fine-tune. It makes money, and at 100 million views a month it's at massive scale with close to no involvement from me. It's actually the largest consumer AI wrapper seen by people in India.

And here's the weird part. Because of so much competition among the base model providers for avatars, and because open-source avatars like Tango are 90% there and will get better in the next 2 to 3 months, we know that at some point we may not even need to depend on any third party. That's why I keep my eyes on all the new models. In fact, we're repeating the same thing for B-rolls. If you look at text-to-video, there's a model called Hunyuan that's open source, and we're currently working on
fine-tuning a model with my clips on it. Here's an early version, and as you can see, this early version is not very good. We prompted "Varun picking up fruits". But it'll get good, just like it got good with the avatars. The technology might change, the base model we use might change, but it's the same thing again. In fact, that advantage of the avatars, combined with my wife's AV video editing school, allowed us to deliver entire videos both for ourselves and for customers, which then allowed me to scale my company, Aeos.

In fact, most of you don't know that we have many more avatars running across many channels for ourselves and our customers. All that RCB Virat Kohli dubbing, or the Bangalore police with an AI avatar behind the scenes ("In view of the 17th Midnight Marathon event organized by Rotary Club of Bangalore IT Corridor"), or so many of the channels we do for our customers along with ourselves: that's all us. And it wouldn't be possible for us to put out the sheer volume of videos that we're putting out across so many different channels without having avatars.

In poker, this is called playing on the river, the last card. It's irrelevant which base model provider wins. We can win some piece of the outcome anyway, and if there's enough competition among the providers, we may walk away capturing way more value than them. And if more breakthroughs keep coming, the cost of training a whole model, not just fine-tuning but a whole model, might keep going down. And once the card is up, everyone will know the architecture. In fact, there are now full Windows app trainers for fine-tuning models. It's very easy, and whoever has the best data wins.

The two kinds of people who will win are, first, the people with some very specific type of data that other people cannot use, like clips of Varun Mayya that are only useful for training a Varun Mayya model. And the other kind of people who will win are the people who build useful products on top of the models. You can call them wrappers, but actually it's
just skipping the poker table. Watch the big guys fight and just integrate the best model. Who cares if it's GPT, DeepSeek, or something else? If your product is useful, you can make money by providing utility for your customers. Ultimately, building a business is about customer utility, not about what technology you're using behind the scenes, especially when the technology is something that everyone can do and everyone's able to replicate, and the costs are constantly dropping. Your customers are rarely going to be ML engineers on Twitter. They're going to be the general public, who just want something useful.

For all the crap that people give wrappers, they seem to be a far more sustainable business model than the underlying models, at a much lower risk, because you're playing on the river. But even in products there is competition eventually. Everyone who's built products in the past understands that whoever reaches the most customers will win. It seems like I've been saying for a very long time that distribution is key. Now you might say

### [20:39](https://www.youtube.com/watch?v=UR3HGeldsAQ&t=1239s) Closing Thoughts

one last thing: "But Varun, you know, pioneering the thing is valuable. Taking the risk to make it happen, spending $100 million training a GPT, showing the world what's possible, it's worth its weight in gold, in legend." And I agree with you. OpenAI is legendary. But remember how I told you we're surrounded by an audience? No matter how hard the audience pushes you to bet, they always forget later that you took the bet, and they'll boo you when you lose. The audience is fickle. Today OpenAI, the company that started this war, that literally took the research that Google had sort of left on the table and actually implemented it, is being called out on social media. They're being called names. They're being called losers and thieves and murderers and non-innovators.

So one thing I've learned over 12 years total now of running businesses, and 2 years of this specific thing, is that people only remember the winners, the eventual winners, the last man standing. No matter what crazy thing you tried, nobody cares if you don't win. Nobody cares that I was actually trying 3D avatars four or five years ago. That method didn't work. I didn't win back then. In fact, a lot of you want to do research for the admiration it brings, the admiration from that audience, but I'm telling you, they don't care unless you win. Can you name the researchers behind refrigeration? Do you remember Percy Spencer, who invented the microwave? No. But you do remember Coca-Cola though, right? You remember sugar water. That's the sad part of our society.

That's why I believe that it's important to first do well. Then, if we make enough money, we may have the access to open some face cards. We may be able to turn some ourselves. And I'd recommend the same to all of you. DeepSeek is amazing because it dropped the cost of opening new cards. If something like this happens again and drops the cost 10x again, we can all train full models from our laptops at home. Regardless, remember that the customer doesn't care. Remember that people, the audience, don't care. They just want
something useful from someone they trust at the best prices. As long as you remember that, apart from how the audience feels, which changes every week, the last week in AI with DeepSeek changes nothing for the builders. If you are the kind of person that's creating value for everybody else using the models, it doesn't matter who really wins. And if you are a researcher, or you're interested in India getting ahead, remember that it's not about spending all that money to reopen the same card somebody else has opened. It's about putting the talent together to be able to open the next card, whatever that might be. And that is the real crux of this video.

Anyway, that's it from me, and remember, you're always free to chart your own path. You shouldn't be listening to what anybody else on social media tells you about what you should do with your time or what the country should do. I think it's pretty unfair for a person sitting here to tell somebody else how to use their money or how to take risk. I don't want to be just like the audience. I will put my money where my mouth is, in the things that I believe I can win at, and avoid playing the games, or partner with companies playing the games, that I know I cannot win at. And I think that is really what my 30s have given me compared to my 20s, my early 20s, when I thought I could take on everybody. I think being smart matters. Anyway, that's it from me. Make sure you subscribe. Bye.

---
*Source: https://ekstraktznaniy.ru/video/12057*