SQL Data Science Interview Questions and Answers (interview style walk-through) | Question 3
13:59

SQL Data Science Interview Questions and Answers (interview style walk-through) | Question 3

Tina Huang 21.09.2020 17 631 просмотров 383 лайков обн. 18.02.2026
Поделиться Telegram VK Бот
Транскрипт Скачать .md
Анализ с AI
Описание видео
Hi friends! Welcome back to SQL Sundays! A weekly SQL data science interview questions and answers walkthrough. This is another data science SQL interview question and answer done from beginning to end in a real interview style. 🔗Affiliates ======================== My SQL for data science interviews course (10 full interviews): https://365datascience.com/learn-sql-for-data-science-interviews/ 365 Data Science: https://365datascience.pxf.io/WD0za3 (link for 57% discount for their complete data science training) Check out StrataScratch for data science interview prep: https://stratascratch.com/?via=tina 📲Socials ======================== instagram: https://www.instagram.com/hellotinah/ linkedin: https://www.linkedin.com/in/tinaw-h/ discord: https://discord.gg/5mMAtprshX 🤯Study with Tina ======================== Study with Tina channel: https://www.youtube.com/channel/UCI8JpGrDmtggrryhml8kFGw How to make a studying scoreboard: https://www.youtube.com/watch?v=KAVw910mIrI Scoreboard website: scoreboardswithtina.com livestreaming google calendar: https://bit.ly/3wvPzHB 🎥Other videos you might be interested in ======================== SQL Sundays Playlist: https://www.youtube.com/playlist?list=PLVD3APpfd1tuXrXBWAntLx4tNaONro5dA How I learned SQL from Scratch in 11 Days to Pass my FANNG SQL Interview: https://www.youtube.com/watch?v=vaD3ZFFNwhM How I consistently study with a full-time job: https://www.youtube.com/watch?v=INymz5VwLmk How I would learn to code (if I could start over): https://www.youtube.com/watch?v=MHPGeQD8TvI&t=84s 🐈‍⬛🐈‍⬛About me ======================== Hi, my name is Tina and I'm a data scientist at a FAANG company. I was pre-med studying pharmacology at the University of Toronto until I finally accepted that I would make a terrible doctor. I didn't know what to do with myself so I worked for a year as a research assistant for a bioinformatics lab where I learned how to code and became interested in data science. I then did a masters in computer science (MCIT) at the University of Pennsylvania before ending up at my current job in tech :) 📧Contact ======================== youtube: youtube comments are by far the best way to get a response from me! linkedin: https://www.linkedin.com/in/tinaw-h/ email for business inquiries only: hellotinah@gmail.com ======================== Some links are affiliate links and I may receive a small portion of the sales price at no cost to you. I really appreciate your support in helping improve this channel! :)

Оглавление (5 сегментов)

  1. 0:00 <Untitled Chapter 1> 694 сл.
  2. 3:27 Expected Output 551 сл.
  3. 8:01 Check the Solution 508 сл.
  4. 10:45 Pros and Cons 161 сл.
  5. 11:31 Cons 530 сл.
0:00

<Untitled Chapter 1>

hey everyone welcome back to another video so a bunch of you guys reach out to me having watched my past data science sql interview question walkthroughs asking me where i found my questions and also uh you know how i came about getting that dummy data that i used to test out my solution so honestly the way i did it was i just went glass doors and found the questions and just did it manually and in terms of the dummy data i just made it up um thought about the different edge cases and put it onto dvc check this was also the way that i was doing it when i was preparing for my own fang data science interview and honestly it actually did take a really long time um and it's kind of like i don't want to say it's like completely wasted but it's just like not very productive because you're spending your time not practicing the question but trying to find a question and also making sure that your answers are correct so that is why i'm super excited to tell you guys i partnered up with a platform called strata scratch nathan from stratoscratch basically reached out to me saying like hey i have a platform where i curate real data science sql interview questions and then um he puts in all the dummy data so that you don't have to do it yourself and he also posts a solution which he either does himself um or and slash or it's crowdsourced from other users that use his platform and he himself is also a data scientist so you know his solutions are very credible there when he reached out to me i was really excited because what he described you know perfectly addressed those pain points of finding the questions yourself and also creating dummy data to test out your solutions so i was like super excited and i checked out his platform and i have to say that i think it's a really awesome tool so long story short i agreed to be an affiliate for his platform um and by the way i just spent a significant amount of time betting the platform out i went through like a lot of the questions and did them myself and just checked out the solutions as well to see if they're actually correct or not and you know because i feel really responsible for what i promote and i'm pretty picky about it in general i want to only promote stuff that i really see as both really helpful for you guys and it's cost effective so full transparency i do receive a commission if you use the link that i'll be linking in the descriptions below okay so instead of me talking about it i'm going to show you guys what it actually looks like and how it works and for myself um as i'm going to be doing more of these weekly sql interview question walkthroughs i'll be using strata scratch except for the times when i'll be using my whiteboard by the way you should definitely stay until the very end of this video if you want my honest opinion about the platform its pros and cons and who i think would benefit the most from using it so without further ado let's tackle the question all right so this is strata scratch the platform and our question today is a medium question from facebook and it's called number conversation by each user and the question is find out the number of conversations send or receive by each user by date okay and the table we have here is called fb messages and it's id date user one user two and message count so id is i believe the message id the date here is gonna be the date in which it was um sent and then user one and user two user one is sending it to user two i believe and for message count is the number of messages that are being sent okay let's also look at the expected
3:27

Expected Output

output so this is what it's supposed to look like so we have three columns here you have the date you have user one okay so that's going to be the user and you have message count okay so because we're finding out the number of conversations sent or received by each user by date okay so i do have an assumption i'm going to be making here um it's basically the fact that because you're doing send and receive that means that there will be duplicate rows in the sense that you know you can have your date for example here and your user one is to roam 75 and your message count is three so jerome is sending something to someone else and they sent three messages but because we're also doing received which means you might have another row um when jerome responds when it's sorry the person who jerome sent it to responds to him so you know that person sends it to jerome so you have jerome again and has received and then your message count here so they're i'm assuming that um it's okay to have this kind of duplication because they say they want the number of conversations both sent or received by each user okay so those are my assumptions and let's tackle the question all right so first part i will i want the ones that are sent first so get message count for sent and then we want to get message count for receive um and then we want to email these together so adding them together and that should give us this table okay cool let's do the actual query so from fb messages select so we want the date user 1 and message count okay and then we want to get the message count for received so from fb messages select date user 1 oops user 2 in this case and message count okay and then we want to union these together we want to actually use union all over here and not union because union would actually remove these duplicates um and as i was saying from my assumptions before we actually want these duplicates over here um because we want users that were both sending it and receiving it this looks correct to me and let's check it out oh man failure okay um okay a solution is for a solution to be accepted your output should match the expected solution values okay what is not matching here huh you know what i really think that this is correct um okay so find out the number of conversations by send or receive by each user by date okay you know what i feel like it might be because you want it to be ordered by date that's the only thing i can think of um that can potentially be different okay whatever let's just try ordering by date all right why is it not working okay order by a date okay now it worked okay so that does look the same over here i guess like it wasn't actually well it actually wasn't ordered by date because unless you explicitly state it you don't have a guarantee that's order by day so now that we know that
8:01

Check the Solution

let's check the solution which they have here okay date user two message count okay yeah they do order by day as well okay that looks correct um i actually have the same solution as this one so yay good job everyone okay so we got the right solution but we're not quite there yet remember when i was talking about my framework for how to approach each question and finish it and if you haven't seen that video definitely check it out i will link it above where i go over my framework for approaching each question so after you do a question you also want to think about potential ways in which you can optimize the solution and also business use cases in which maybe there's some things that you can do and that you can amend so that it's more useful for business uk use case so in this case remember those duplications i was talking about and has an assumption the question itself wanted these duplicates but if you think about it when you have duplications um it's just kind of you know you just have like another row there and it's not super interesting unless you're interested in exactly how many messages each person sends to the other person how many that messages that person received as well so if you're actually looking at a conversation you might be more interested in the entire length of that conversation how many messages were sent in total so i'm actually going to leave that to you guys think about how to do that and actually comment below um in terms of how you would invent this query to be able to address that situation so just to make sure you know i'm more clear on this for example when you have you crawford four and you crawford one here so we actually want one row for this instead and then it should say eight three twenty you know zero zero um and then you crawl forward and then here it's one and four so we want that to be five so yeah leave a comment below and write that query how you would be able to do this and we're done see how we don't actually have to go and create dummy data by ourselves think about edge cases and then put that onto db sql to check our work it's all done already you know the data says here dummy data is here you can test it out immediately so that is really useful and saves us a lot of time also comment down below how you guys solve this question i'm super interested in seeing your solutions and with any sql question there's so many different ways of approaching that question and i'm really interested to see how you did it all right so now that you guys had a chance to see the platform and see me work through a question using that platform as promised i'll now talk a little bit
10:45

Pros and Cons

more about the pros and cons so the pros the questions are real sql interview questions and that is amazing because you don't have to go on a glass door and try to carry that yourself it's already done for you and a second really big pro is that you don't have to go and create dummy data for yourself and check out your solution and see if it's correct or not also a huge time saver third is that there are solutions and you can use that to verify your solution i think that's really helpful if you're just starting out and you might actually not know how to approach a question and four there's actually python solutions as well so i didn't demo this um in this video because of the sql data science interview walkthrough but it does have python solutions as well and that's super helpful if you're also going to be practicing python for your interviews
11:31

Cons

okay so now the cons i vetted this product pretty thoroughly and went through quite a lot of different questions and i noticed that there is some disparity between questions um in terms of their quality so some questions are really great and other ones you know not as good um so it's this is like a con that you should be conscious of in the fact that not every single question posted is going to be a really high quality question although they do have a really large bank of questions another thing is that the questions can sometimes be a little bit confusing for example the columns might not have descriptions and you wouldn't know how to actually approach that question or like even in this case right like i had to kind of like think about what could possibly fix my query and it was because i didn't write order by date and that's something that wasn't explicitly listed for the question so in the natural interview they will probably actually tell you that they wanted to be ordered by date but in this case you kind of have to figure it out we're like look at the solutions and third there are some minor bugs that i noticed for example sometimes if you write your query and it's actually not correct it would tell you that it's correct so it's really important um to actually verify your solution with the official solution and also see if your query what comes out of it actually matches the output table overall i think this is a really excellent product if you're someone that's really serious about preparing for data science sql interviews or python they also have python and it's a really big time saver in terms of the fact that you don't have to go and create your own questions or make up your own dummy data so yeah that saves you a huge amount of time another thing is i'm actually pretty surprised by how reasonably priced it is so let's see here it's 25 a month or six dollars and 25 cents a month for a year's access or 159 dollars for a lifetime actually i was pretty surprised by how affordable it is and it says here that they also have a seven day refund policy so yeah there you go you can try it out at no risk it's not perfect and you do need to be mindful of some of the points that i listed out earlier with that being said though i have chatted to nathan quite a bit as i was going through this product and betting it out and he's really responsive to feedback and really wants to improve the platform i truly do see this platform improving a lot in the future as he adds more content and just curates it more alright i hope you guys really enjoyed today's data science sql interview question and answer walkthrough and intro to strata scratch if you're interested in scrout of scratch please do check it out in the descriptions below and i'll see you guys in the next video

Ещё от Tina Huang

Ctrl+V

Экстракт Знаний в Telegram

Транскрипты, идеи, методички — всё самое полезное из лучших YouTube-каналов.

Подписаться