Data Science SQL Interview Question Walkthrough (real interview style) | SQL Sundays #8
7:07

Data Science SQL Interview Question Walkthrough (real interview style) | SQL Sundays #8

Tina Huang 28.10.2020 10 723 просмотров 268 лайков обн. 18.02.2026
Поделиться Telegram VK Бот
Транскрипт Скачать .md
Анализ с AI
Описание видео
Hi friends! Welcome back to SQL Sundays! A weekly SQL data science interview questions and answers walkthrough. This is another data science SQL interview question and answer done from beginning to end in a real interview style. 🔗Affiliates ======================== My SQL for data science interviews course (10 full interviews): https://365datascience.com/learn-sql-for-data-science-interviews/ 365 Data Science: https://365datascience.pxf.io/WD0za3 (link for 57% discount for their complete data science training) Check out StrataScratch for data science interview prep: https://stratascratch.com/?via=tina 📲Socials ======================== instagram: https://www.instagram.com/hellotinah/ linkedin: https://www.linkedin.com/in/tinaw-h/ discord: https://discord.gg/5mMAtprshX 🤯Study with Tina ======================== Study with Tina channel: https://www.youtube.com/channel/UCI8JpGrDmtggrryhml8kFGw How to make a studying scoreboard: https://www.youtube.com/watch?v=KAVw910mIrI Scoreboard website: scoreboardswithtina.com livestreaming google calendar: https://bit.ly/3wvPzHB 🎥Other videos you might be interested in ======================== SQL Sundays Playlist: https://www.youtube.com/playlist?list=PLVD3APpfd1tuXrXBWAntLx4tNaONro5dA How I learned SQL from Scratch in 11 Days to Pass my FANNG SQL Interview: https://www.youtube.com/watch?v=vaD3ZFFNwhM How I consistently study with a full-time job: https://www.youtube.com/watch?v=INymz5VwLmk How I would learn to code (if I could start over): https://www.youtube.com/watch?v=MHPGeQD8TvI&t=84s 🐈‍⬛🐈‍⬛About me ======================== Hi, my name is Tina and I'm a data scientist at a FAANG company. I was pre-med studying pharmacology at the University of Toronto until I finally accepted that I would make a terrible doctor. I didn't know what to do with myself so I worked for a year as a research assistant for a bioinformatics lab where I learned how to code and became interested in data science. I then did a masters in computer science (MCIT) at the University of Pennsylvania before ending up at my current job in tech :) 📧Contact ======================== youtube: youtube comments are by far the best way to get a response from me! linkedin: https://www.linkedin.com/in/tinaw-h/ email for business inquiries only: hellotinah@gmail.com ======================== Some links are affiliate links and I may receive a small portion of the sales price at no cost to you. I really appreciate your support in helping improve this channel! :) Music Credit: TheFatRat & Laura Brehm - We'll Meet Again https://www.youtube.com/watch?v=hJqYc62NCKo

Оглавление (3 сегментов)

  1. 0:00 Introduction 296 сл.
  2. 2:07 Approach 356 сл.
  3. 4:55 Solution 361 сл.
0:00

Introduction

hey guys this is sql sunday number eight let's jump into it all right so today's question is from google called activity percentile find the email activity percentile for each user email activity percentiles defined by the total number of emails set the user with the highest number of emails sent will have percent help one and so on i'll put the user total emails and their activity percentile and order records by the total emails in descending order okay let's take a look table so google gmail emails id from user to user as well as the day okay and the expected output here is going to look like user total emails and activity percentile okay let's take a look at the table here okay so it seems pretty standard this is pretty much what i expected so it's zero from user to user and we have the day over here so we want to get the user the number of emails that they sent and as well what percentile it is that they rank cool so let's look at the expected output okay so this is a little unexpected on my part um when they said activity percentile i assume that the end tile here would be you know what percentage or what percent how that person is in terms of their total emails but actually seems like it's just the ranking in terms of total emails um let's see here okay it looks like it's just total ranking and also the ranking so you have 1919 but ranked one two so it's you know ranked like where you want each number and just because they have the total number of emails is not ranked the same cool all right so it's assumptions up
2:07

Approach

front uh i kind of stated it already it's the fact that if you have total number of emails then you would actually be ranked different ones as we can see over here um i don't think i don't think i have any other assumptions to make um let's see yeah looks about right all right let us write our approach so we're gonna say here we're going to group by from user and then we're going to count number of emails so that would just be count and then after that we will get the rank so it says activity percentile here but we can actually get the row number window so we're going to use a little nifty little window function here all right so let's actually approach this question so from google gmail emails so i had our select statement from user i want to do the count oops count as total emails and for our function a window function so i'm just going to write this down i'll explain it a little bit as well what this means so row number over order by um count here and we want it in descending order and we'll call that and tiles okay so row number here actually means um we're ranking it we're ordering it by the count which is the total number of emails by descending because we want the total emails that are the highest to show up first and we're getting the row number of that so it's just ranking it from one all the way up to however many people that there are and because we want it so that everybody gets a different number despite the fact they have the same total number of emails that's why i want to use row number as opposed to rank functions all right cool and we have to group by one that should work so let's see uh shall we check it i think this looks right to me okay cool yep yep this all looks good to me cool looks like we got it right um so in terms of
4:55

Solution

anything i can do to actually make it more efficient oh actually let's check out their solution first okay so they did a sub query from user count and then grouped it over here and they used n tile so n tile also works it's just another function um that you can use to rank it in this case so yeah you should get the same results using end tile both would work cool so it looks pretty much the same here all right yeah so in terms of efficiency um i think that my approach might be a little bit more efficient because i don't do a sub query but except for that i think it's pretty similar i also don't really see how we can improve it that much more as always let me know in the comments below what your approach is especially since we've never seen a window function before so i would love to see how you guys think about what that um and if you also use the window function if you didn't use a window function i'm also really interested to see how you did it then okay so in terms of improvements to make um i don't in terms of like what we can do after this so since we're actually looking at the ranking i think it would actually be useful to go ahead and also find the actual percentiles of these people and grouping it together so what percentile each person is and what they belong to so last homework question for you guys why don't you also modify this function so that we can see the percentiles for each of these um users in terms of their total emails let me know what your solution would be in the comments below all right good luck everyone on their sequel data science interviews if you've been following the series all the way from number one to number eight here if you follow through and approach the questions like i suggested i am very confident that you'll do super well on your actual interview alright see you guys in the next video

Ещё от Tina Huang

Ctrl+V

Экстракт Знаний в Telegram

Транскрипты, идеи, методички — всё самое полезное из лучших YouTube-каналов.

Подписаться