# The Founder of Adobe’s First India Acquisition | Man Behind Viral SRK Deepfake AI

## Metadata

- **Channel:** Varun Mayya
- **YouTube:** https://www.youtube.com/watch?v=JS44nVWWbUk
- **Date:** 29.12.2023
- **Duration:** 1:01:00
- **Views:** 54,364

## Description

I'm Varun, and this is a show where we get about the future of tech, entertainment and business. While other podcasts scratch the surface, I think it is important to meet the best people in their fields and go much deeper. Today, on the set, we have none other than the man who’s turning text into video. Ashray Malhotra, founder of Rephrase.AI.

Ashray Malhotra, the Co-Founder and CEO of Rephrase.ai, leads a company that leverages artificial intelligence to streamline video creation. Prior to his role at Rephrase.ai, he gained experience at notable organizations such as SoundRex, Google, Techstars, and Goldman Sachs. Malhotra is an alumnus of the Indian Institute of Technology, Bombay, where he earned his engineering degree.

Rephrase.ai specializes in automating video production through Generative AI. This technology enables users to efficiently broaden their audience reach. The platform offers diverse applications, including the customization of videos for sales purposes, enabling characters to speak in augmented and virtual reality environments, among others.

Malhotra's achievements include being listed in Forbes Asia's "30 under 30." He has also participated in several startup accelerator programs.

Check out https://www.oneplus.in/open

00:00 - Highlights
00:46 - Intro
02:00 - Ashray's background and journey with AI
03:32 - Key insights in building AI avatars
05:00 - Learning from customers
07:27 - The challenge of audio vs. video AI
11:43 - Interesting AI avatar applications
14:16 - The use of AI avatars and Deepfakes in politics
17:56 - The challenges of detecting fake AI content
19:55 - The story behind the famous Shah Rukh Khan campaign
26:29 - Shah Rukh Khan's reaction to the AI avatar
27:54 - Key insights from the campaign
28:49 - Predictions on the pace of progress in AI
35:32 - Pessimistic vs. optimistic views of current AI capabilities
41:27 - Open source vs. closed source for the future of AI
47:30 - The importance of distribution over product
48:52 - Avoiding impatience and the importance of compounding
50:36 - Hardware vs. software - where to make bets now
55:43 - Is AI here to stay or a passing fad?
57:11 - Limitations of AI usefulness
59:16 - Advice to young people worried about AI
01:00:08 - End Notes and Gifting Ashray a OnePlus Open
01:00:49 - Outro

## Contents

### [0:00](https://www.youtube.com/watch?v=JS44nVWWbUk) Highlights

How did Shah Rukh feel? My respect for Shah Rukh since that campaign has gone much higher. He's really smart; he understands more about video technology than most people. You think in the elections they're actually using avatars/deepfakes right now? 100%. We had a request from Kenya, someone willing to pay us a few crores, where his simple goal was: take the politician on the opposite side, make him speak, personalized videos, lots of variations of that. "But I can't give you money; I can give you land in Kenya." Once AGI comes in, there's absolutely no value of smartness, hence you have, say, a three-year window for everything that you can possibly add value to in the world. One last piece of advice that you have for young kids who are worried about their jobs? AI can't write emails. It's not going to take your jobs, you're fine. AI is not as useful in its current state as most people make it

### [0:46](https://www.youtube.com/watch?v=JS44nVWWbUk&t=46s) Intro

out to be. Today's guest has actually sold his company to Adobe and is one of the pioneers of generative AI in India. They were working on generative AI before anyone even knew what generative AI was; it didn't even have a name. Ashray and I have chatted for a while, and anytime I find a new paper on virtual avatars I'll bounce it off Ashray and be like, "What do you think of this?" And then when I do a video he'll be like, "What are you using for this?" So we had that kind of relationship. Now, you may not know who he is, but you definitely know his work. Remember that famous Shah Rukh Khan Cadbury campaign? Ashray was the CEO of Rephrase, the company that actually pulled that off, and today, for the first time ever, he's going to go in depth and discuss the entire Shah Rukh Khan ordeal, how he got Shah Rukh Khan to do an AI avatar of himself, and all his predictions for AI, a lot of which I actually happen to disagree with. So it's going to be a very engaging conversation. You are going to learn a lot. By the end of the conversation we're going to learn what he thinks the future holds, where there are opportunities, where you should definitely not step, and what the future of deepfakes and avatars is in politics and in day-to-day life. Ladies and gentlemen, welcome to another episode of OnePlus Open Conversations. Ashray, thank you so much for

### [2:00](https://www.youtube.com/watch?v=JS44nVWWbUk&t=120s) Ashray's background and journey with AI

being on. You've had a hell of a journey, from IIT Bombay all the way to Goldman Sachs to doing this. So firstly, congratulations. How do you feel?

I think it's exciting for us that now we get to build what we wanted to build for a really long time without the constraints of short-term monetization. So we're really excited.

Nice. And Ashray, you've had a very long history. We always do one question about your past: how did you get here, what's the journey been like? I saw your Techstars; you had a wild journey. Can you summarize that journey for us?

Yes. At Techstars, just to set context: in March of 2019, Nisheeth, Shivam and I started Rephrase and got up on stage at Techstars and said that we want to build a magical black box that takes any piece of text input and creates a Hollywood-level movie. For us, on that long-term vision, the v0.1 was the ability to create real human avatars. It was just surprising that instead of spending six months building those avatars, we ended up spending four years, and we ended up realizing it was a much larger market than what we had thought. I'm sure we'll dive deeper into the story, but in the last year it's been interesting how the world has changed its belief from saying "what you're doing is absolutely crazy and it'll never happen" to saying "hey, it'll happen in the next three months, and what do you do after that?" The reality is always in the middle. But yeah, it's been a very interesting journey.

Tell me two

### [3:32](https://www.youtube.com/watch?v=JS44nVWWbUk&t=212s) Key insights in building AI avatars

three instances while you were building these avatars where you just had a complete shift in view, where you were like, "Hey, I've done it like this, but no, the solution is something else." What is it? What's roughly going on behind that black box?

In our very early days we had considered traditional computer vision techniques, and now I'm talking really long back.

Can you explain that for the audience, in very simple words?

There are two techniques to do computer vision. The traditional techniques are very rule-based, and machine-learning-based approaches are where you basically stop writing rules; you just feed enough data and let the machine learn on its own. An example that really resonated with me: imagine you had to build self-driving cars. One way of doing that is you actually program, "Here, on this top right, you will see a light..."

A lot of if conditions.

Exactly, it's a loop of if conditions, right? But there's always this next edge condition that you haven't planned for. Whereas in machine learning you don't do the conditions; you just feed in data, you tell the machine, "Hey, in this situation, this is what you do," and the computer learns on its own. So earlier we used to train small models and try to do a lot of data supervision so that the machine would work. Over a period of time we realized you just trust the machine more. It does a really good

### [5:00](https://www.youtube.com/watch?v=JS44nVWWbUk&t=300s) Learning from customers

job.

Yeah. What were one or two insights from your customers? I think the first time I saw a Rephrase avatar, a year or a year and a half ago, I don't remember when I first saw it, I was impressed. For the time, for the alternatives, it was great. So what are one or two insights you learned from your customers? How are people using it, where you were just like, "What?" And also about the industry.

Yeah, we've learned a lot. To put it into context, there are two major applications that we used to serve. The first was long-form content: imagine a 30-minute course or a 5-minute marketing video. The second was personalization, where we take one base video, in most cases that base video is actually recorded, and we change just 2 to 5 seconds of that recorded video. The advantage of that is, if 90% of the video is recorded and you only change 5 seconds, the human perception says, "Well, everything else is real, this should also be real."

So it's good for those marketing campaigns where on WhatsApp you get some influencer saying "Hey Rahul, do this" or "Hey Rahul, buy my course", that sort of thing?

Yeah, exactly, that kind of stuff. So that's personalization. Studio would be creating an entire course content, right? Both of us could be avatars here instead of being here in person; that would be a long-form use case for us. So these are the two major use cases. As we've learned over time, one of the surprising insights, and this is a mistake that I made: I took a bet that the largest use case for long-form content would be marketing. As it turned out, at least the way the industry is shaped today, the largest use case of long-form content is actually in corporate learning, in HR.

So internal training and stuff? That's a larger market than marketing as of today?

Yes, and that is a mistake that cost us dearly. So, a very hard-learned lesson. In the late years of the pandemic we had to pick a market, and I decided, well, who wants to sell to HR? Why would I ever do that? So we decided we'd just sell to marketers: much more fun people to talk to, much larger budget sizes and all that. Except that where the quality of the avatar is today, it works well enough for corporate learning, not well enough for marketing. So again, a very interesting insight.

Interesting. And you were saying

### [7:27](https://www.youtube.com/watch?v=JS44nVWWbUk&t=447s) The challenge of audio vs. video AI

something about the industry in general right before we got on this conversation, where you said, "The surprising thing for me is how audio is lagging behind video." Because there are two parts to this: there's the video part and then there's the audio part. And I've learned this the hard way, right? It's just so hard to get super accurate audio, and video is just now so good. Why?

One of my co-founders dug up an article from the 1990s. A new text-to-speech engine had been discovered at that time, and the headline of that article said text-to-speech is a solved problem: now you get to hear text-to-speech which is as real as a human being, and you will not be able to distinguish real voices from artificial voices. When we started the company we did believe in that hypothesis. For the first couple of years, actually till the Rakhi or Shah Rukh campaign, we did not have audio in-house at all. Our approach always was that we would specialize in computer vision, in the video, and we'd outsource the audio to someone else. It could be open source, it could be lots of other companies, but instead of doing everything in-house we'd just do one part. But as it turns out, audio is really hard. It's easy to get to audio with 90%, I'd say 90 to 95%, accuracy. I'm not sure what the exact reasons are, but as of today audio very often ends up giving away that a video is AI generated. It's not the video, it's not the lip sync. Sure, in some cases it could be the emotions as well, but in most cases it ends up being the audio, which is why you see most software in this industry offering you an option to upload your own audio and generate a video on top of that, just because audio is so tough.

But you think it will be solved?

I would love to make predictions, except that in the world of AI predictions are so hard to make that I don't know. I would want it to be solved, but I'm not sure.

Not sure because you think the ethical constraints of this are going to be crazy, or you think it's going to take a lot longer than we think it will?

It's very difficult to predict breakthroughs in the world of AI. It might be that tomorrow we have that magical AI voice solution that we don't have today, but it could also just take much longer. To keep myself humble, I very often remember that when my dad was buying a car in 2018, I told him, "That's the last car that you're buying, after which every car that you buy will drive itself and you'll never have to drive a car." I was wrong. The next car that he buys is something that he will drive. And the way the media was talking about it, the technology was honestly growing really fast, and everybody was raising billions of dollars saying cars are going to drive themselves. So while I'm all in for advanced AI, and I would hope for videos to be perfect and audio to be perfect, I just don't know when that'll happen.

Yeah. I think the way I measure this is, you have to decide in terms of the percentage of the population that can be fooled. Where we are at with video, and you saw the Nithin deepfake, right, it happened here, when you saw that you can tell that there's a section of society that's fooled, that wouldn't be able to tell that's fake. My wife couldn't tell, and she works with us, where we make some of this stuff. She couldn't tell. She was like, "Why are you showing me such a boring video?" I was like, "What?" Till the end. So I think maybe there's 10%... you would be able to figure it out because you worked there, right? I'd be able to figure out if something was fake, or I'd be able to tell something's wrong. But I think there's a section of society that can't, and there are repercussions to this, like the elections and whatnot. We'll come to that. But you're right about the audio part. It just feels like something is missing. I thought it would be solved a few months ago; I would admit that I was wrong about audio. It's taking a little bit longer. There are ways to solve it, like with RVC, where I input a voice that already has emotions, and it will map that to somebody else's voice. But in a way that's cheating: you're making a model for transforming somebody's voice rather than synthesizing it from scratch. So I think, yeah.

Another interesting... your

### [11:43](https://www.youtube.com/watch?v=JS44nVWWbUk&t=703s) Interesting AI avatar applications

previous question, where you asked about the most interesting thing. Actually, two examples I can give of interesting avatar applications that came our way. In 2022, which was a few months after we'd done the Shah Rukh Khan campaign, we had a request from someone from Kenya willing to pay us a few crores, where his simple goal was: take the politician on the opposite side, make him speak, personalized videos, lots of variations of that. "And I'll give you this much money, but I can't give you money; I can give you land in Kenya." It was so funny. The conversation was like, we at Rephrase have never done any land deals, and we've not done any political deals. But yeah, that was just a very weird conversation. The second was similar. I was at CES last year, and there was one guy, I was a little wary of him because on his badge his company was crossed out; he had basically just made a fake entry. He mentioned lots of applications, "Hey, I can use you here and here, I'd love to get started with you immediately," large brand, large deal values. And when he gave his email address, it was clear that he used to work at one of the top five porn sites in the world. So yeah, those have been very awkward interactions in the world of avatars. But by and large our customers have been amazing; they've used avatars for really good use cases. I think the core thesis of personalization, the ability for every individual to be able to connect with a large number of people, can have great positive impact as well.

Yeah, for me to be able to connect to my audience, for all of them to be able to do a call with me. And that's not just video and audio, right? It's also my thought process.

Exactly.

You've got to port your thought process, for which I think the solution is fine-tuning the text models, right? And now that we have Mistral and Mixtral and all the others, I think we are ever so close to that world where you can talk to me and I can scale myself. I don't know if that's going to dilute the value of me, right? Because with a lot of creators, you've got to be very careful about not overexposing yourself to the audience. They're creating content every day, blah blah; eventually, if you feel too approachable, sometimes it doesn't work out. But there are also creators who have been very close to their fans. I think mainly

### [14:16](https://www.youtube.com/watch?v=JS44nVWWbUk&t=856s) The use of AI avatars and Deepfakes in politics

politicians will really benefit from this, because I once spoke to a politician and I was like, "What wins a vote?" And he's like, "Nobody's going to read the manifesto. The vote bank is not going to read the manifesto; they're not going to care about all this. But if I have attended their dad's funeral or their grandfather's funeral or something, they will remember. They won't care what I'm actually going to do for the neighborhood, but they will care that I came to that funeral once." So he's like, "That's a personal touch. If I can replicate that personal touch, where I'm talking to them, I'm able to get on a call with them and console them, and it doesn't say anything that puts me in trouble, because then they can record it and be like, 'Look, this guy said this but he didn't do it,' that's a win." In a way I feel like it's kind of bad, because then you don't know what your avatar is saying.

I agree. To that point, I think the person who really cares should win the election rather than the person who's the most tech-savvy. But yeah, next year I think one in two or one in three people in the world are participating in an election, and companies are going to make a lot of money, or these people are going to use open source to create lots of personalized videos.

You think it's happening right now? You think in the elections they're actually using avatars/deepfakes right now?

100%.

Interesting. And no repercussions? Nobody knows how to deal with this yet?

I know that deepfakes have been used in elections last year, the year before, and the year before that as well. That's been fine; I'm okay with that. The problem is it's a slippery slope. All the examples that I've heard of till now are a particular political party creating an avatar of their own leader and communicating with everybody. That I'm okay with. The problem is once that political party starts to create an avatar of the other side and then makes that other side say something or look foolish. That's a problem. What I'm afraid of is that that'll start to happen next year.

Yeah, I think it's already happening. I mean, not the deepfake part, but people making fun of the other side. Dude, they just need dirt. And the thing is, before the elections actually happen, let's say one day before the election, you have somebody on your side send you a message saying, "My name is Rahul, I'm leaving the election, you vote for the other guy." Just one day before. By the time you figure out what happened, you're done, you've lost the vote, because the person has seen it on WhatsApp. And like I told you, the entire population can't tell. Even today I think the tech is good enough that most of the population can't tell, especially in tier 2, tier 3 towns. No chance.

It's true. I just wish for two things to happen. First, instead of this happening one day before the election, it happens like a year before the election, so that it gives people some time to build the understanding. I think people even in tier 2, tier 3 India have now started to acknowledge that a WhatsApp message might not be very real. If someone just sends you a text on WhatsApp, fake news has been spread enough that people have built resilience to it. People would at least think or question once before believing.

That's urban India.

You think it's not happening?

No chance. No, I think the immune system will take some time and it will take some mistakes, and we haven't had enough time or mistakes happen yet.

I agree that it takes a bunch of mistakes to build that immune system, which is exactly why I'm hoping these mistakes start now instead of one day before the election, because I am absolutely sure that one day before the election you could cause some really bad stuff. Tech companies could

### [17:56](https://www.youtube.com/watch?v=JS44nVWWbUk&t=1076s) The challenges of detecting fake AI content

also do a lot of stuff there. For example, there are lots of initiatives across almost all large tech companies where you could have a signature in the metadata, at the time of recording the video, then editing the video, and then using it for AI generation, and this signature could be highlighted for the recipient. So say YouTube shows in its description that this video is AI generated, based on the metadata.

No, I don't think it'll work, because of capitalism. There'll always be one guy selling to these political institutes saying, "I will not put a watermark, pay me more." And as long as that market exists, I think it's hard to tell. What do you think of deepfake detection? I've heard a lot of people claim we can do deepfake detection, avatar detection. I've tried it with some of the outputs from HeyGen and stuff; it's not accurate.

A good friend of mine runs a company called Reality Defender in New York, where they do exactly this. I am bullish on that.

You think it'll get to a good point?

It's like virus and antivirus, right? Both sides will continue to become better and beat each other, but that does not mean that one of the sides has to give up. There is very much a need for deepfake detection, so any effort in that direction I appreciate.

The problem is the incentive to create the virus, and the money you make creating the virus, is far more than the money you make defending against the virus, unless the incentive for defending goes up.

Except that I think now enough financial institutions have lost a lot of money because of deepfake voices. There have been real crimes. I've been in touch with some law enforcement bodies in the country, and even they are at least at the point of "tell me more, teach me more." So you're right, I think there's always slightly more incentive to do bad things than good things, but it's catching up.

Yeah. You know, let's take a

### [19:55](https://www.youtube.com/watch?v=JS44nVWWbUk&t=1195s) The story behind the famous Shah Rukh Khan campaign

detour here. I think you guys really rose to fame when you did the Shah Rukh campaign. You did a mega campaign with Cadbury and Shah Rukh; it was everywhere. I saw it on Twitter, Instagram; even now there'll be some posts from some marketing companies saying, "Look at these guys, they did a Shah Rukh deepfake." Run us through the entire process. What was it like?

I can go even a few months further back. In late 2020 we believed that our technology was almost ready to go out, so in late 2020, early 2021, we launched a self-serve product where people could come in and create their own videos. It was priced at $25, $50, something like that, and that was the first attempt to go out to market. We realized that even selling a $50 subscription was taking us really long. You either need to have an enterprise-level reward at the end of this long sales cycle, or the sales cycle needs to be short. So it wasn't paying off. We had little money in the bank at that point, and every day I would feel defeated, saying, "What's the point of doing this, where you have this amazing technology and you made $50 today? Great. How do you feel about your life?" So we put together a small team of people whose job was to figure out something else while this thing wasn't working. One of the things that I felt really strongly about was that it should have some application in gifting of some sort. Cameo was on the rise in 2021, and I was like, AI could really disrupt Cameo.

For those that are watching, Cameo is an app where you could get any celebrity to say happy birthday and things like that, and you'd pay them some money for it, like $5, $10 or something.

Yeah, in the US you would end up paying a lot more. But I did believe that AI could do this and that it has some value. I don't know what the value is, is it one rupee or is it 1,000 rupees per video, I don't know, but it has some value, so let's do that. We went to the Mondelez team via Ogilvy, and during Rakhi we pitched them a concept, saying we'll just put a QR code and let brothers and sisters, along with giving chocolate boxes, gift each other a video as well. Their team also came up with a very similar idea at that time, and after a period of time they took a bet on us. It was huge for us; it was a game changer. Before that, companies would come to us and say, "So your value proposition to me is: you want me to give you 30 minutes of my celebrity's time, you want to have the power to make them speak anything that you want, you're kids out of college, you've done nothing, and you really want my celebrity to be okay with you being able to make them speak anything?" It's just a ridiculous conversation to have. But they took the bet on us. I have an insane amount of respect for the Mondelez team. They had a plan B, just in case we messed up and our tech didn't work, whatever. But during Rakhi it was a big success. They didn't market it all that much, because they wanted to have a plan B just in case it didn't work, but whoever tried to create a video was fairly happy with the output.

How did they get Shah Rukh involved?

Once the Rakhi campaign was successful, they decided to take a much larger bet and make it their primary application. The previous year they had already done personalized videos, but it was personalized videos for retailers where they would just write the name of the store as text, and that's it. So they made one ad and just 1,000 copies with different shop names written. This year they decided to push the boundaries. They talked to multiple celebrities. And my respect for Shah Rukh since that campaign has gone much higher. He's really smart. With Red Chillies, he understands more about video technology than most people do. Even after we did the shoot with him, he got one of our team members in the room and explained to him, "Don't use a single camera, use multi-camera setups, do this and do that." It was really good. But that campaign really pushed the technology. At that time we needed 15 minutes of video to be able to create a copy of someone, and the Mondelez team was sure that they wanted to make it an ad, a true ad, not just a spokesperson video. So they wanted to have him in five different locations, and we were like, "If you change the location, you've got to do it again; it's a different avatar." So he spoke basically nonsense for 15 minutes, five different times, at five different locations. And it can be really tiring. I've of course done it a bunch of times, speaking 15 minutes into a camera, and it's tiring. So for him to be able to do this five times on a single day at different places...

By the way, I think this is the first time people are hearing the story, the real story.

Oh yeah, there was a lot that happened during that campaign. We had a one-month gap between the shoot and actually taking it live, and I think that's the hardest I've ever worked in my life, that 30-day period, because we were almost making up parts of our tech on the fly. For example, as soon as you color corrected the video, our engine would fail. We just never knew about this, because no one had color corrected footage and given it to us before. So we had to come up with immediate solutions: how does your tech now become resistant to facial corrections, color corrections and all of that stuff? So we built that. By the time the campaign went live, I had almost decided that I'll never do an ad campaign ever again, a TV ad campaign, just because it's so much work. Taking a video out on social, on digital, is fine, but to make a mainstream TV ad, a mainstream campaign for one of the largest FMCG companies, on the largest festival for them, was...

How did Shah Rukh feel

### [26:29](https://www.youtube.com/watch?v=JS44nVWWbUk&t=1589s) Shah Rukh Khan's reaction to the AI avatar

like, once he saw the outputs, and he's like, "Oh my God, you can do this"? How did he feel?

He was very supportive of the whole thing, which was very surprising, because, to a point that you were mentioning earlier, a big celebrity could just say, "Well, I don't want to be that approachable," right? Because no one has done it before, you don't know how the reaction would be. But I think in his entire career he has taken multiple tech bets before, and he was willing to take yet another tech bet.

But when he saw the output, what was the reaction like? "Oh my God"?

He was happy. One of the things that's been important for us is that no customer ever sees a video before the person who's the face of the video has approved it. So at all points he had a veto: "If I don't like the output, this will not go live." It's still his identity; he has a reputation to maintain. So only once he approved the outputs did the campaign go live. A lot of other things happened during this campaign as well. At that time there was some personal stuff happening in his life, with him and his son, and there were impacts of that on what could have happened with the campaign. But I think it's all thanks to the Ogilvy team, the Wavemaker team, the Mondelez team and, of course, everybody at Rephrase. The stars had to align, we all put in an insane amount of work, and that campaign

### [27:54](https://www.youtube.com/watch?v=JS44nVWWbUk&t=1674s) Key insights from the campaign

happened.

What's the one insight you got from that campaign? I think you'd agree with a lot of these things, but my one insight from this was that technology in isolation wasn't that powerful.

We had this technology for like nine months. It was just that when you put the most creative people on the planet together with the best technology on the planet, on, like, real human technology, that's where money is made, that's when the magic happened.

And why did you learn that?

Because the reason that campaign was successful was not just technology, it was the entire story behind it. It was to position it a certain way: instead of just saying "Hi, [name]" to every customer who buys a Cadbury box, to do it for merchants and build an entire narrative, with an intent to truly help. All of that added up.

I think the world

### [28:49](https://www.youtube.com/watch?v=JS44nVWWbUk&t=1729s) Predictions on the pace of progress in AI

is moving rapidly, right? I think everything is kind of becoming a tech thing, even content. We're sitting in a OnePlus studio, which is a tech company. I have some thesis around where the future is headed, and this comes from us building some of this technology, fine-tuning some models for clients and ourselves. I think fundamentally there are two problems in AI, and I'm just going to classify them broadly. One is AGI, artificial general intelligence: can you build something that can do all the economically viable work that humans can? I'm just going to define it like that. Some people also define AGI as something that can modify itself and all that; I'm not going there. I'm just saying it can do all of the work that human beings do. The other side is embodiment: how do you take that and put it inside a robot, or better yet a human, or better yet take a human, modify your personality, and put it inside a robot? All sorts of variants are possible. I'm an accelerationist: I think it has to go really crazy, really fast. I can't see any other way, because I've grown up with the computer, so I'm a nerd and I don't have too many friends, and this is kind of a side effect of that. But a question for you: where do you see the next 5 years go, 10 years go?

As I said earlier in the pod, it's very difficult to make an accurate prediction of where we'll be in the next 5 to 10 years. Before we get to the sci-fi world, let's just talk about the real world. There were, say, 20 models that Google had published a few years back. OpenAI just decided to pick one of those 20, the diffusion model, put money behind it, and it worked. Then they continued to train larger and larger models and just fed them more data, and it just worked. We don't know until when it'll work. You don't know where the wall is exactly. Maybe diffusion models are enough to take us to AGI, so we've discovered the holy grail and that's it. Maybe it's that. But also maybe GPT-5, whenever they train it, is 10 times more expensive and 10% better than GPT-4, and you realize that for the next big thing you need a fundamentally different architecture than diffusion models are...

Than Transformers are.

Yeah, and then you're back in the wilderness: what is that next big thing? Maybe you figure it out very quickly, or maybe you don't. I'm optimistic about things moving fast because arguably the smartest people in the world are working on this problem, and they're very well incentivized to work on it.

Exactly, and it's not even a money thing, it's an ego thing. "You built AGI" is such a nice brand title, and I know the smartest people in the world are like, "I want to win that throne."

Yes. One of my co-founders very often mentions that the thing he has is smartness; once AGI comes in, there's absolutely no value in smartness. Hence you have, say, a three-year window for everything that you can possibly add value to in the world.

So your timeline is 3 years?

I mean, some people's timeline is three years. I don't have a prediction of when AGI will happen. I think it could happen in 3 years. I also think the AGI moment in itself is a little overblown. Sam Altman recently tweeted that the Turing test got defeated and the world just moved on; nothing happened. So we expect AGI to be this miraculous thing that just changes the world, and maybe it doesn't. Who knows?

It should change the world a lot.

Yeah, but it's difficult to make future-looking predictions. For the next one year the predictions can be a lot more accurate. You're seeing lots of companies do text-to-video; text-to-video will happen. You'll see a lot more companies come up with hardware specialized for AI. You'll see all of these models which are running in the cloud start to run on your local devices. So in the next year or two you can be a lot more certain about predictions; after that we'll just have to see. In fact, I think VCs in the space underestimate the probability of a complete change from a Transformer or diffusion model architecture, which would just completely change your ops pipelines.

Yeah, I mean, even if you're putting $100 million in a company today, you're putting it in the team. I think that's the fundamental change. You're no longer looking at the metrics of the business and saying, "This is growing 10% a year." With most of the businesses raising 100 mil, like a Mistral raising 100 mil, you're betting on the team; you have no idea what's going to happen. When Google came out with Gemini, their entire thesis was that it's purely multimodal. It's using Transformers, but it's purely multimodal: it can do everything, text, audio, video. And Runway came out with a new video; I don't know if you saw it recently, they came up with a trailer...

Yeah, the world model.

The General World Model, right, which is like: a human being is not just going to read text, consume text, and talk. They're also going to look at things, hear things, touch things. Can we codify some of that? Not touch, but can we codify the rest? I think that's beautiful, and I have no idea where that leads. But I also know that for commercially viable tasks AI is kind of getting there for a lot of tasks. I'm talking about digital tasks, about content writing. There are two kinds of content writing. One is where you're doing generic stuff, just SEO farming; that AI solves. I saw somebody do an "SEO heist" recently where they stole a lot of keywords from a competitor by cloning all their page titles but writing AI-generated content for them. But there's also the kind of journalism where you go and do investigative stuff, which AI is not going to be able to do anytime soon. So I think all of this is very nuanced. But when you have a General World Model and it can actually talk to people, pick up the phone, dial, and be like, "What happened?", that's when it starts getting a little bit weird and touchy.

"General World Model" is a great name for a model, because if there existed a model which knew everything about the world, it would be called a general world model. It would solve everything. I would not want to make any prediction about a model until the point it's trained, until the point I can actually use it. It's very easy to overhype all of this stuff. So again, these are all promising approaches; we'll just have to see which one wins, and

### [35:32](https://www.youtube.com/watch?v=JS44nVWWbUk&t=2132s) Pessimistic vs. optimistic views of current AI capabilities

what's the pessimistic viewpoint?

My pessimistic viewpoint is... I think Gemini wasn't such a big leap over GPT-4. And I think you've got to hold both, right? The pessimistic viewpoint is that we've hit a wall, and that's fantastic, because it already does a lot of menial work, so you can just export the menial work to it. You do the smart work, the work that requires human legwork: building the distribution, building the relationships. You use this as your back office. I think we're already kind of there.

But also, if you look at the last year, it just hasn't stopped at all. ChatGPT happened in November of last year; that's like 13 months back. Gen AI is only 13 months old for like 99% of the people out there, and the last 13 months have been absolutely crazy, without a doubt. My pessimistic view is about how good anything in AI really is. The current version of AI is a great teasing moment for almost everything that you could potentially do, but what can you really do with it? I used it to write my emails and then I stopped. It doesn't write good emails. It doesn't write the email that I would like to send.

But have you fine-tuned the model yourself to write like you a little bit?

It's still just generic.

Yeah, it's generic. Somebody needs to put a shot of personality into it.

One of my co-founders, again, used to say that GPT-3 was like a fifth grader trying to bluff their way around an answer when they don't know anything. Now maybe it's like a 10th grader. But if it's my email, it should have more information than a 10th grader's email would have. This podcast, sure, from a video and audio perspective, could maybe be outsourced to an AI, but the real content today is in your head. It's in your head and it's in my head. So is GPT really all that good? And when you start to question that, you question almost every aspect of it, which we've seen with a lot of our customers as well. Everybody will love your first couple of sales calls. When they start to use it for really long, at that point they start nitpicking. So for real-world use cases at scale, which is the kind of stuff we actually talk about when we talk about AGI, is AI really there today? No. And that's the pessimistic view.

Yeah, that's why Microsoft had that paper out, "Sparks of AGI": we see some evidence, but we need more evidence to conclude that this can do a lot of commercially viable tasks. Because technically you can pass the Turing test for some kinds of people today. If I'm a bank and I want to send a closure notice or whatever, I can use AI for that, and most people will be like, "Okay, whatever." They're not going to care. If you're cold DMing someone, it could be useful if you generally write generic messages or the person responds to generic messages. But you wouldn't be fooled by a GPT comment. If somebody commented on your video with a GPT answer, I would know, and I often call them out: "Why are you using GPT to answer like this?" But I also feel, having fine-tuned some of the Mistral stuff, that it's going to get there.

Got it.

Because until now everyone who's putting their own data into GPT is using a method called RAG, and before that they were using LangChain, which is kind of a variant of it through vector embeddings. To be very honest, there are lots of people who defend it very vocally, so I don't want to piss them off, but I think it's garbage, because we've tried it for clients and clients are like, "This is bad." Then we fine-tune it and we get better outputs. So I think fine-tuning is the approach, but fine-tuning takes work: you've got to go find your data, clean it up, put it away. So right now I'm working on something called a policy document for myself. It's wild, but I've had this dream of automating myself. I have a 3D avatar of me that I tried in Unreal Engine 2 years ago, before MetaHuman even came out. After MetaHuman, I have a 3D scan of my face which I put into MetaHuman. I've always been trying to do this, because I love producing videos but I hate sitting down and creating content like "this is what happened in the world today". I don't want to do that; it should be automated. Then I said, okay, there's a better approach with AI, so we moved there. And I will not stop till it can also write scripts like me. So I've written a policy document: what do I say, what do I not say? I'm being very verbose. I'm, like, training the thing. "Training" is the wrong word, but I'm giving it boundaries. Whether it'll work or not, we have no idea, but I'm going to try every approach and exhaust it, because I'm not so keen on building a foundational model myself. I have a very specific task: replace me. Because I think you can do a lot more things with your time if you don't have to sit and create content. I only care about that end task, and that means I have no allegiance to any model or anything. Whatever works.

And if it doesn't work... I would love for it to work, and whenever it does, please let me know. But that's a very ambitious goal. I do know that, for example, if you had to create a 30-second short today, and you were to train it on all the shorts that you've ever done till date, it could do it. But could it produce a 30-minute long one? No.

And that's the hope. Paul Graham once put out this tweet saying that AI is almost like cameras: the resolution kept increasing. People thought it would be like old software that just gets better and gets new features, but actually it just got more accurate over time. It started with the dream machine that just kept hallucinating, and slowly we're getting it to hallucinate less over time. It's almost like the trend line is accuracy rather than improvement in features and capability, which is a nice way to look at it.

That's an interesting take, yes.

Hey, so I have another

### [41:27](https://www.youtube.com/watch?v=JS44nVWWbUk&t=2487s) Open source vs. closed source for the future of AI

question, right? I think the one place that's never settled in this entire thing is open source. When I tried Mixtral... you can download an app called LM Studio on your computer and just try these models now. It's like one click: you have a GPU, you're good, you're sorted. Mixtral is good, the 8x7B is good, and open source just doesn't stop. And the way they dropped it: they just dropped a torrent link and said, do whatever you want with it. I really think open source is going to win; that's my thesis. Meta is doing some great stuff. What are your thoughts on open source versus closed source? Financial incentive to win a market versus lots of people contributing and being excited about it: what wins?

If the example is Mistral, Mistral very much has a financial incentive to win at the end of the day. They've raised at a couple-of-billion-dollar valuation. Those are smart investors who put in money and want money back. If you just look back at history, open source has won some wars. For example, Linux runs the world; they beat Windows and macOS. So a bunch of smart people sitting in a room could build some really amazing stuff, and that's great. The problem in the world of AI is that to train a core foundation model at scale you need at least hundreds of millions of dollars. At that point it goes beyond a couple of crazy kids sitting in a dorm room being able to build this, because you now need a couple of crazy kids with $100 million, and that's not a common set. So if you look at it through that lens, it's actually now more difficult for open source to compete against people who have some financial incentive. Now, if Facebook decides to continue to open source Llama 3 and Llama 4, and Mistral continues to open source its stuff, I absolutely believe in the power of the community that Llama 2 was able to garner, with everybody almost working for free for Facebook and giving it, you know, "here's a cheaper system" and "make this one minor tweak to your engine and it works a lot faster". That's great. But I think it's a fine balance. It's very likely that at some point, once the financing for AI ventures dries up, a lot of the companies will have pressure to not open source models and to make money.

Because we're seeing this with Stability. Now they've put a subscription on it: if you want to use it commercially, then you've got to pay us some money. So I can see it. But I also think it's weird. I don't know what Meta is thinking, but maybe they're like, "OpenAI has won this, Gemini is going to do this, they're going to be the two commercial models, we need a different path." I've seen them do this in VR. When Apple was going to announce the Vision Pro, and everyone knew it was coming, Meta dropped the Quest 3 trailer: "Hey, we're $300." They're always taking the contrarian path, and what I like is that they're spending all that money training the models and just giving them to the world for free. So I feel it'll be some combination of a big enterprise looking to win brownie points or build a large community, and a bunch of people using that technology to build and fine-tune models. We fine-tune models, and without a Llama I don't think this is possible. Even Mistral is like an offshoot of Llama. A lot of people keep complaining about Mistral, saying, "Oh, they raised 100 million but they didn't build a foundational model, they're using Llama and fine-tuning it." Who cares? Who the hell cares? It's their training data, they have separated the training data, and they're the ones taking responsibility for the outputs. I think we're seeing more and more of that, where somebody comes and says, "Actually, my model's better." It doesn't matter how it's trained or what the base model is. OnePlus has this AI music studio where you put three things together, you choose a mood, a style, and a theme, and it gives you something nice. I'm sure they've not built a foundational model for it, but it doesn't matter: the end user is happy, it's on their phone. So I think that will be the future, and I'm sure you're seeing some opportunities there.

I agree with the core thesis of what you mentioned. I don't have a final answer on whether open source will win or closed source will win. What I am sure of, though, is that this is an absolutely amazing time to be an end user of AI, because everybody is fighting to get you the best solution possible. I very often think that the difference between, say, you and me and, say, Elon Musk has shrunk a lot. At $20 a month we have access to the best LLM. That's it, it's just $20 a month. The phone that you would use is, what, $2,000? That's it. It's not like a better phone exists at $100,000. It just doesn't. Call it the victory of capitalism in some sense, but everyone's trying to democratize everything and serve the customers. It's a great time to be a customer of AI. AI is a technology, right? It's not a product in itself. Someone's got to build the product, and whoever builds the product is the real person who's solving the customer's problem, adding a lot of value, and he gets to keep a part of the value for himself.

And have you

### [47:30](https://www.youtube.com/watch?v=JS44nVWWbUk&t=2850s) The importance of distribution over product

changed your mind on distribution?

In what sense?

Like, when you start a SaaS company it's often all "product, product, we've got to build the best product", but second-time founders are always like, "It's got to be about distribution." Do you feel that way?

It is absolutely about distribution. If you could pick between product and distribution, it is definitely about distribution. It's about sales, even when you get access to the distribution channel. I 100% agree with that.

Interesting. And if you were doing this again, do you have one or two insights, things you would change, things you'd do differently?

I feel very happy with what we did, because it helped us accomplish a lot of things. But if there's one thing I could do differently, I would have picked the market in 2020, stuck to it for a really long time, and let it compound. I think one of the things we did wrong was that we switched around and did too many things. We worked with some of the best companies in the world on personalization. We've created digital avatars for almost every big celebrity in the country by now. We did long-form content in the US. We worked with media companies in different parts of the world. We spread ourselves too thin.

So you'd do focused work next time?

Focus, and even if it doesn't pay off in the short term, you just let it compound. I think we underestimated the value of compounding.

That impatience

### [48:52](https://www.youtube.com/watch?v=JS44nVWWbUk&t=2932s) Avoiding impatience and the importance of compounding

when you've raised money and you really want to see the hockey stick... it's hard to come back.

The hockey stick is a big part of it. A part of it also is that the grass is always greener on the other side: you see what's wrong with you, not with other people on the other side.

And for those listening, hockey stick means there's a mythical point in most startups where suddenly everything starts working and all your numbers go up and to the right.

To be fair, when we did the Shah Rukh Khan campaign, and for many months after that, I did feel that hockey stick. I almost felt what true PMF looks like, again for a duration of many months. If I had been able to take more calls, I would have made more money. So I don't think it's mythical. It exists. The core concept of PMF, which says that there's a point when customers are pulling the product out of your hands, exists. It's real.

And you feel like the impatience to get there is what drives a lot of founders to do too many things?

You've got to be a little impatient in a startup. You can't be like, "Well, hey, you know..."

But you'd focus. You understand that you'd feel that impatience, but you know that doing one thing and compounding it over the long term is...

Yeah. If we had gotten that moment 6 months later, it would have been fine, as long as we had gained that compounding. And compounding happens across everything. It happens in your understanding of the customer. It happens in your brand positioning, about who you are. It happens in the product that you're building, where the features become very industry-specific and deep. So that compounding play is really important.

And what do you think of hardware?

### [50:36](https://www.youtube.com/watch?v=JS44nVWWbUk&t=3036s) Hardware vs. software - where to make bets now

We've had a chat around this. You know I'm building something in hardware next year. I really want to try. I'll probably fail, but I think you can't call yourself an accelerationist and then not put your money where your mouth is. I want to try hardware, but you've not been a big fan of hardware. You're like, "Dude, I will do everything, but I won't try hardware again." Why? Because I think the opportunity to put a fine-tuned model inside an action figure or a painting... now's the time, because latency was the big problem, right? You speak to it and it responds 6 seconds later. Too slow. But now Deepgram has released Nova-2, and it's pretty fast, like 150 to 160 milliseconds. It's fast enough.

For context, Varun and I had almost a 2-hour chat about this exact topic, and even that day, when you asked me whether I think latency is a problem, I said I don't. So you've highlighted latency, and I've told you I don't think latency is a problem.

Why?

Even if it's a problem today, it'll get solved. There are things that scaling laws can solve, and that feels like the kind of thing they can. As I said, there are some things that you can't predict about AI and some that you can. This is, in my mind, one of the predictable things: latency will go down. The cost will go down, latency will go down, all of that will happen. You're seeing SDXL now, and super fast image generation.

Instant.

Exactly. So all of that will happen. There are two reasons. First, and I know we disagree on this, I still think there are real opportunities in software.

No, I don't disagree with the opportunities. I just think you need too much money to do it. The barrier to entry is too low, so every kid is doing it, and staying relevant is just tough, especially if you're building for a very high-TAM problem where somebody's going to go out and raise 100 mil. That's a sales game, dude.

I agree 100%, which is why you go after a game which is not that high-TAM. Go after a smaller, niche application. You and I live in the AI bubble. What percentage of people even use ChatGPT? I'm very often amazed by how few people are willing to pay $20 a month to get access to GPT-4. But that's the real world; our world is different. So I still think there are very real things to be done in software, and scaling up software is just so much simpler than scaling up hardware. I would say even at Apple scale, but at least as a small company, you know, your predictability of...

Give me one or two reasons why I shouldn't do hardware, which maybe the audience can also listen to and hear from. It's okay, I want to know all the cons.

So Nisheeth and I started a hardware company before we started Rephrase, along with Shivam. So, quick context: I've burned my hands in hardware.

So, from experience.

Exactly. Everything about hardware is hard. When you're prototyping in software, debugging is so simple. When you're doing hardware, of course you'll do some simulations to try it out, but realistically you'll order a PCB. It takes a week to come, and in that week you're doing basically random stuff or preparing for something else. It's not the most optimal use of your time; if you could have it immediately, it would be faster. So it slows down everything, just because prototyping takes long. After that you have to do design for manufacturing, which is different from prototyping. And then when you start to manufacture, you have to take calls on how many units you'll sell, because based on that you make foundational decisions about the product. You make such decisions in SaaS as well, right? You make architectural decisions based on how much you want to support. But those are very low-cost decisions; in some sense you can reverse them. In manufacturing, either you've over-ordered and haven't been able to sell that much, so you have inventory sitting on your hands and you're losing money, or you've underestimated, in which case there's so much lost revenue that you could have made. And by that time a lot of other people know about the exact same thing you're building. Unless it's very specialized hardware, you'll then have competitors in the market doing exactly the same, and that's lost revenue for you. Again, more competitors, higher cost. And then you'll have to deal with customs. Everything about hardware is just harder. There are advantages to it. The advantage is that some random kid can't compete with you.

There is defensibility.

Exactly. Once you have some sort of hardware, the competition in hardware is a lot less than the competition in SaaS. Also, once you have hardware installed, you can have some sort of subscription model, which you are now uniquely poised to do because you have hardware in someone's house. So there are pros to hardware as well, I don't deny it, but I think there are cons as well.

Makes sense. Last time I was trying to get Ashray on as a co-founder, but he's married to Adobe, maybe for a while. Okay, Ashray, last question. Okay, and I

### [55:43](https://www.youtube.com/watch?v=JS44nVWWbUk&t=3343s) Is AI here to stay or a passing fad?

have... what we do is, we do these hot takes. So, Ashray, you've been in Silicon Valley, you've also been in India, and you've seen people hop trends a lot. First there's personal finance, then there's crypto, then there's AI, web3, blah blah. Is AI truly here to stay, or do you feel it's one more of those trends that is going to be a fad later?

I think AI is definitely here to stay. Whether AI continues to remain the hero feature which sells the product may or may not change, but the AI features which are being built today will, to a large extent, continue to stay. I do believe AI adds very real value in the kind of stuff that people do on a daily basis. So AI is here to stay.

And you don't feel it's a trend? So if there's a youngster listening to this, it's not like, "Oh, this is going to disappear 2 years later"?

100%, I think AI is here to stay. Right now it is at the peak of the hype curve, so it's going to flatten a little bit. Right now, if you go to a video recording software, it'll say "AI-first video recording software", and if you go to a notepad app, it'll say "AI-enabled notes app". That might change. A notepad app is really useful in and of itself, and that should be enough to sell it. There are UX features which matter in there more than just AI. So AI might disappear from the headlines, but the AI features will continue to stay. LLMs are genuinely useful; they add real benefit to customers' lives.

One thing you believe

### [57:11](https://www.youtube.com/watch?v=JS44nVWWbUk&t=3431s) Limitations of AI usefulness

in AI, in the world of AI, that everyone else who works in AI disagrees with? I'll start. I think hardware is a very good opportunity, and it's a lot easier than people think it is, because there's so much less competition. Sure, there's some work that you have to do in the early days to build it up, and you'll make your mistakes on under-ordering and over-ordering, but once you have a rhythm, not every kid can go build it. With images, there are millions of image generators now, everyone has their own spin on it, and there are so many Civitai models all fine-tuned for different tasks. But hardware is tough to get into. So my thesis is that's actually where we should be betting next. But what's one thought that you have that other people in AI disagree with?

AI is not as useful in its current state as most people make it out to be.

Can you expand on that?

The same thing: GPT still can't write my emails. It's a simple enough problem. I'm not asking it to do rocket science or find the next law of physics. I'm just asking it to write my emails.

You think that could be a little bit of prompt engineering, like "write in all lowercase", too? Because I know Grok, which is Elon's, if you ask it to roast you and be vulgar and stuff, actually writes like a very different being. It writes like somebody on the street.

Yeah, just on the way here I was seeing some of Grok's responses, and yes, it's an interesting take. I do believe that AI can have its own unique theme or identity in its replies. AI can do that. Can it have mine? I haven't seen it done today, and I have fine-tuned models.

Interesting. Dude, Ashray, thank you so much for being here. I think this is the first time we've heard the Cadbury story.

This is the first, I hope.

Because I don't think you've gone in depth before. Have you been on a podcast before?

I've been on a bunch of podcasts before.

But you haven't gone this deep.

Yeah, I did a lot of them during the pandemic, and doing this is so much more fun.

What do

### [59:16](https://www.youtube.com/watch?v=JS44nVWWbUk&t=3556s) Advice to young people worried about AI

you think? One last piece of advice that you have for young kids who are worried about their jobs?

Yeah, AI can't write emails; it's not going to take your jobs. You're fine. Also, everybody, including AI researchers, has been wrong about what jobs it will replace. If you were to ask the best researchers in the field two or three years back what it would replace, they would have...

Yeah, even people like Yann LeCun were wrong.

Yeah. So irrespective of whatever field you are in... I can give you advice about which fields I think will get replaced, but no one knows. So just keep doing what you're doing.

Okay, what's definitely a field you should not be in?

One field... the kind of fields that have already been replaced by LLMs today, right? Like very basic customer support, or, say, being very bad at sales. LLMs are a great bad sales rep. Yeah, those kinds of fields.

Interesting, dude. So we have one ritual

### [1:00:08](https://www.youtube.com/watch?v=JS44nVWWbUk&t=3608s) End Notes and Gifting Ashray a OnePlus Open

at the end of every episode. We were talking about hardware, and one of the pieces of hardware I really like is just this phone, right? To pack so much into such a small form factor, such a thin form factor, and still have it be both a tablet and a phone at the same time is incredible. But we gift this to all customers; I wanted to make it an experience. So here's the OnePlus Open, the flagship OnePlus Open. We've taken an hour of your time, and I think this is something we give in exchange. Thank you so much for being on.

Thank you. Thanks for having me. It's always fun to chat with you, with or without the cameras.

Lovely. So, ladies and

### [1:00:49](https://www.youtube.com/watch?v=JS44nVWWbUk&t=3649s) Outro

gentlemen, that was Ashray. Hope you learned something. I learned a lot; I always learn a lot when I talk to Ashray. Bye-bye, and see you next time.

---
*Source: https://ekstraktznaniy.ru/video/12407*