# A Completely New Way To Build Make.com Scrapers: Hidden APIs

## Metadata

- **Channel:** Nick Saraev
- **YouTube:** https://www.youtube.com/watch?v=R8o7V39NSSY
- **Date:** 29.02.2024
- **Duration:** 27:59
- **Views:** 15,160
- **Source:** https://ekstraktznaniy.ru/video/12823

## Description

GET THE BLUEPRINT HERE FOR FREE ⤵️
https://leftclicker.gumroad.com/l/leersa

WATCH ME BUILD MY $300K/mo BUSINESS LIVE WITH DAILY VIDEOS ⤵️
https://www.youtube.com/@nicksaraevdaily

JOIN MY AUTOMATION COMMUNITY & GET YOUR FIRST CUSTOMER, GUARANTEED 👑
https://www.skool.com/makerschool/about?ref=e525fc95e7c346999dcec8e0e870e55d

Here’s one of the most creative and high-ROI ways I build Make.com scrapers. Warning: not for the faint of heart & involves a fair bit of tech. But the end result is a very versatile system that you can build in minutes & sell for thousands of dollars.

BROWSER AUTOMATION SOFTWARE
https://apify.com/

WHAT TO WATCH NEXT 🍿
How I Hit $25K/Mo Selling Automation: https://youtube.com/watch?v=T7qAiuWDwLw
My $21K/Mo Make.com Proposal System: https://youtube.com/watch?v=UVLeX600irk
Generate Content Automatically With AI: https://youtube.com/watch?v=P2Y_DVW1TSQ

MY TOOLS, SOFTWARE DEALS & GEAR (some of these links give me kickbacks—thank you!)
🚀 INSTANTLY: https://instantly.

## Transcript

### Segment 1 (00:00 - 05:00)

What's going on, everybody. Welcome to another video in our course: Make.com, but for people who want to make real money. In this video I'm going to cover an extremely advanced topic that, if I had to guess, less than 0.1% of people who use Make.com know about. It's called hidden API access, and it's how you build completely automated scrapers that bypass paywalls, sign-up gates, and authentication flows. This is going to be a really advanced video; if you haven't already seen everything I've posted on API calls and how to set up a request module, make sure you check that out first, otherwise most of this is going to seem like magic. But if you're already at that point, this video is going to show you an entirely new way to leverage Make.com. If that sounds like something you're interested in, stay tuned.

First thing I'll mention: this works really well for services behind paywalls. The other thing I want to mention is that this isn't going to work for all services, so treat what I'm about to show you as an option, one of the things to check quickly when you get a web-scraping project from a client or want to set up some type of web scraper for yourself.

If you remember from a previous video, when we scraped a website using just the HTTP request module, we sent a request to the website and received all of the site's HTML. Then we used parsers, like the HTML to Text module, to turn that into something we could either feed to AI for extraction or run through the Text Parser module to isolate things like email addresses or URLs. But what if there were a way to call the backend of the service that produces that HTML and get all of your data perfectly structured to begin with? What if we didn't have to do any of the parsing we did in that previous video, and could instead call the hidden API, the back end, directly, using the skills we already have? That would obviously make your life a lot easier, especially when there are complex sign-up flows involved. That's exactly what we're going to cover in this video.

This is one of my favorite topics of all time. I showed this to one of my business partners and it blew his mind. I showed it to a bunch of Make automators in various groups that I'm in and it blew all of their minds. It's something you can use to take your Make development to the next level.

So I've set up an example here. This is just one of many data resources out there where you sign up, pay a little bit of money, and get access to what is basically a database of listings. If you remember, we previously did a Redfin video where we were just scraping the HTML; Redfin is a free resource, it's not really behind a paywall. What I want to do now is set up something that actually lets you go past a service's front end and call its servers directly. We do have to be a little sneakier, because we're doing something that's not really sanctioned: companies build their APIs for their own purposes, and if an API is private rather than public, that purpose is for them to deal with, not for you. But it's still really cool.

Crexi.com is just a random data source for real-estate listings. It lets you find a bunch of property records: market and property data, listings, all sorts of things. I've set up a search here for a specific place in Arizona, and when I click through you'll see a bunch of filters have been applied. On the right-hand side we have map listings, and on the left-hand side we have cards you can click into. This is really similar to what we had with Redfin, more or less the exact same thing: with Redfin the data was a bunch of listings, and at the top of the page a map pulled latitude and longitude from those listings to populate the user experience.

Now, all of this data has to come from somewhere, right? So where is it coming from? When I'm signed into the service and press enter, the client (my browser) sends a request to the server. The server looks at that request, presumably queries a database, pulls a bunch of records based on the filters I've set, and sends them back. That response then gets used to populate the HTML, the actual page with the divs and all that, which is what's served to the client. That's a very high-level overview of how these requests work in practice. The crazy thing about most of these services is that you can find exactly where that data is coming from, and then, instead of scraping the rendered page, call that resource directly. The way you do that is you right-click on the page (again, this won't apply to every resource, so treat it as one tool in your arsenal next time you're faced with a scraping project), select Inspect, and go to the Network tab.
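The difference between the two approaches can be sketched in a few lines of Python. The HTML snippet and the JSON payload below are invented for illustration; the point is that the backend's JSON makes every field addressable by name, with no parsing step:

```python
import json
import re

# Old approach: scrape rendered HTML and pick values out with a parser/regex.
html = '<div class="card"><span class="price">$1,200,000</span></div>'
price_from_html = re.search(r'class="price">([^<]+)<', html).group(1)

# Hidden-API approach: the backend already returns structured JSON,
# so every field is addressable by name and no parsing step is needed.
api_response = json.loads('{"data": [{"price": "$1,200,000", "city": "Phoenix"}]}')
price_from_api = api_response["data"][0]["price"]
```

The regex works only as long as the page's markup never changes; the JSON path works as long as the backend's schema holds, which is usually far more stable.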

### Segment 2 (05:00 - 10:00)

What I want to do is refresh this page. Notice how we get hundreds of requests being made; these are all the requests the browser is currently sending to the server, and they have random names like trust-box-view, stats?, aggregate? A lot of this is going to seem like witchcraft. What I want you to do is click through a few of them, poke around, and look for commonalities between these requests using the headers on the right-hand side. What comes up again and again as I scroll through? There's a URL: https://api.crexi.com. This is the hidden API this service uses to send and receive data and manage all of its requests. Which means, odds are, we can reproduce the request being sent and skip the browser entirely; we can just send a request directly to that URL.

A quick and easy way to find the right request is to go to the Payload tab and scroll through, looking for data that comes back. I already know which network request is important here: it's the search request (makes sense, they call it the search endpoint), and I know that's where we'll get the data. But if you don't know which endpoint retrieves the data, go to Preview and thumb through; I'm just holding the key down and scrolling as fast as possible, looking for response data to pop up. A few rules of thumb: anything that ends in .js you can usually discard, because it's just loading JavaScript; anything that ends in .css is usually a stylesheet, something used to build the page visually, so you don't have to worry about it; anything prepended with data: is usually a font. What we're looking for are requests with simple names like account, stats, moderated, or sell-list, because those are usually calling an API resource. You want to scroll through all of these. I'm not going to do that all on camera, because as you can see there are around 400 requests and it can take a fair amount of time, but once you know what you're looking for, you usually figure it out pretty quickly.

Once you've found it, the resource will look something like this... should be one of these... there we go. (There was a small issue here: you can select a specific time range that filters which responses are shown.) What you're looking for is something like this: when you go to Preview, you see a big JSON object with a giant boatload of listings, just like we have here. If I click into each of these listings, you'll see all of the information that's represented on the left-hand side of the page, just in tabular JSON. So this is the data source for Crexi: this URL, https://api.crexi.com/assets/search. You'll see there are a bunch of headers over here, including Authorization, which is the most important one. Essentially, when you log into a service, it hands you a cookie or a little key, and you can use that key to bypass authentication on future requests. Now that we've isolated the URL we want to hit, and seen the data that comes back when we hit it, notice there's another tab called Payload: this represents the data you need to send to that URL in order to get that response. All we're going to do now is copy this stuff into a Make module, so instead of doing it in a browser we do it in Make.com, and then you can schedule it to run automatically once an hour, once a day, and so on. There's a bit more nuance than that, because sometimes these cookies expire (in this case it does expire every now and then), but there are several ways around that; I won't build them all out in this video, but I'll touch on them so you can see what an actual Make.com scraper flow looks like at a higher level. Okay, great, now that all that nerdy shit's done, let's actually go to Make and build something out. I'm going to click on this module,
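Before moving into Make, the request being reproduced can be sketched in Python. This is a hypothetical outline, not a verified API contract: the endpoint path is the one read off the Network tab, while the header names, token value, and payload field are assumptions for illustration:

```python
# Hypothetical sketch of reproducing the browser's call to the hidden
# endpoint. The URL comes from the DevTools Network tab; the header
# names, token value, and payload field are illustrative assumptions.
def build_search_request(auth_token, payload):
    url = "https://api.crexi.com/assets/search"    # endpoint seen in DevTools
    headers = {
        "Accept": "application/json, text/plain",
        "Authorization": auth_token,               # copied from the browser
        "Content-Type": "application/json",
        "Origin": "https://www.crexi.com",
        "User-Agent": "Mozilla/5.0",               # look like a real browser
    }
    return url, headers, payload

url, headers, payload = build_search_request("Bearer eyJ...", {"count": 60})
# A library like `requests` could then replay it:
#   requests.post(url, headers=headers, json=payload)
```

In Make.com these same three pieces map onto the HTTP module's URL field, its headers list, and its request content, which is exactly what gets copied over next.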

### Segment 3 (10:00 - 15:00)

type HTTP, and choose Make a Request. First, we grab the request URL, so I'll enter https://api.crexi.com/assets/search. Second, we go down to the request method (all of this information is visible in the request in DevTools) and change it to POST, because that's the method the browser used. Third, and this is what I'd recommend you do, copy all of these request headers over. I won't do every single one. I'll start with Accept: I open the header value and enter application/json, text/plain. Then Accept-Encoding, and note that I'm not keeping the trailing colons DevTools shows, so keep that in mind as you copy your own resource. The language here is English. Authorization is the important one, so make sure you get it exactly right; you'll see it starts with "Bearer" followed by a very long string. There's a client time-zone offset, and going down the list I'll skip anything that starts with sec- and grab the User-Agent as well. I'm not adding Content-Length, simply because I don't feel it's relevant or necessary, but I will add an Origin of crexi.com. When you do this on your own, just add all of the headers to be safe; I only know which ones matter here because I've tested it, and I want to make sure this video isn't three hours long like they usually are. The last one we need is the User-Agent, which, if you recall from a previous video, is how we make our requests look like they come from a human.

Okay, great. We've now copied over all of the headers. There's one other thing to copy, and that's the payload: we need to copy it exactly as it is, and there are two ways to do so. If you right-click, you can copy the value or copy the object. If I paste the value in, I get one extremely long string; if I copy the object and paste that in, it looks like the same thing except with double quotes around the keys. So I scroll down in the module, go to Body type, click Raw, set the content type to JSON, and paste my request content in. I'll click View Source; mind you, this is extremely long, and you don't always need to copy all of it. A lot of the time that would simply be unfeasible. Crexi is a special case because they put a giant list of latitudes and longitudes in the payload, so this object is freaking huge. Anyway, I'll click Yes on "Parse response." Do be careful when you're calling hidden APIs: while you're testing, don't hammer the endpoint every five seconds, because it will eventually cut you off. Now that everything's ready, I'll click Run Once and see what happens. Waiting for the server... took a little bit. Click the module and scroll down to the output:

the status code is 200, and the data variable has a totalCount of 384. That number looks familiar: we saw it in the payload. The data is now an indexed array, and it's everything we wanted, in perfect JSON, which saves us from building an HTML-to-text step and a dedicated parser or regex for every single variable. We just have everything right here. So what would you do with this information? Let's make it practical: set up a Google Sheet and dump all of these records into it. We don't have a spreadsheet yet, so let me open one. Oh, this is pretty annoying; Gmail always defaults back to
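A minimal sketch of what that parsed 200 response amounts to, and why it needs no further parsing. The key and field names below (totalCount, data, brokerageName, urlSlug) are assumptions based on what's shown in the video, not a documented schema:

```python
# Hypothetical shape of the parsed 200 response; key names are assumed
# from the video, not taken from any published schema.
response = {
    "totalCount": 384,
    "data": [
        {
            "name": "Desert Retail Pad",
            "brokerageName": "Example Brokers",
            "urlSlug": "desert-retail-pad",
        },
    ],
}

# Each listing is already a dict, so building a spreadsheet row is one line:
listings = response["data"]
first_row = [listings[0]["name"], listings[0]["brokerageName"]]
```

Compare that one-line field access with the regex-per-variable work the HTML route would have required for the same record.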

### Segment 4 (15:00 - 20:00)

whatever account you used to open it last. Okay, I'll name it "Example Crexi data dump." Now, I'm not going to go through and make headers for all of this, but realistically you should: go to the JSON you received and create a column header for each field (activatedOn, locations, brokerageName, description, hasFlyer, and so on, all the way down) so you know which columns you're putting the data into. I won't worry about that here.

Notice that the HTTP module's result comes out as one bundle, with a variable called data, which is an array. If you recall from our previous video, some modules have built-in iterators, but others, like the HTTP request module, do not. So if we want to access this data and loop through it, we need an Iterator module to turn that one bundle into, effectively, 384 bundles. You can imagine that's going to be pretty operationally intensive, so I'm not going to run it in perpetuity. As the Iterator's array value I feed in the data variable with the square brackets; as a rule, always go for the highest-level square-bracket variable you can. Then, in the Google Sheets module, I select the "Example Crexi data dump" spreadsheet I just set up, assuming my authentication is still valid (whenever that red warning pops up, it just means the connection expired and you need to redo your authentication). I select all the sheet settings I normally do; the sheet doesn't contain headers, but that's okay. Then I go through and map everything in. Locations might be a little trickier, so I just paste in the full address. Down here we'll map brokerage name, description, has flyer, has OM, has video, ID... If you're building this for a client, they probably don't give a damn about all the random values here; they're probably not going to use updatedOn or urlSlug. But the cost of including them is basically nothing, so you're almost always better off doing it. We mapped 25 variables here.

Next, I'll add a Sleep module. The reason for the sleep is just that I want to be able to cancel this run after 10 or 15 records, because I don't want to burn more operations than that, so I'll add a one-second delay. Now let's walk through what we know about the scenario so far. We call the backend hidden-API resource. In that call we include the headers we copied from Crexi, plus a request body containing all of the settings Crexi lets us apply, which is what's in the Payload tab; it's quite long, with a lot of latitudes and longitudes for this particular API. It won't be the same for every API, but these are the settings being used as a filter in Crexi. What do we expect as output? Under data we expect an array: a single array, not multiple bundles. So we iterate: the Iterator takes that huge data array and produces one bundle for every item in it, and then we map the values into a Google Sheet with no headers, so the rows just get appended one by one. Then we sleep for one second; when the sleep finishes, control goes back to the Iterator and the next bundle starts, so roughly every second the next record lands in Google Sheets. That's what we expect to happen; let's see what actually happens. The API call went through, and we received a status code of

200. You'll see records being dumped into the Google Sheet; if I go back here, that's exactly what's happening. All of this data is now being scraped automatically from a paywalled resource. It looks like some of the fields are combined strings, so if you wanted to, you could add a step that takes, say, the land-details string as input and splits it on the vertical pipe to separate the acreage from the property type. It has all the data we want; every single field, even the image.
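The Iterator-plus-Sleep pattern, including the suggested pipe-split on a combined field, can be mimicked in plain Python. The listing data and the "Acres | Type" details format below are invented for illustration:

```python
import time

# Invented listings; "details" mimics a combined field like "12.5 Acres | Land"
listings = [
    {"name": "Lot A", "details": "12.5 Acres | Land"},
    {"name": "Lot B", "details": "3.1 Acres | Retail"},
]

rows = []
for listing in listings:                 # Make's Iterator: one bundle per item
    # Split the combined string on the vertical pipe, as suggested above
    acres, kind = [part.strip() for part in listing["details"].split("|")]
    rows.append([listing["name"], acres, kind])   # one spreadsheet row
    time.sleep(0.01)   # stands in for the 1-second Sleep module (shortened)
```

The sleep is purely a throttle: it spaces out writes so a test run can be cancelled partway through, and it keeps the hidden API from seeing a burst of traffic.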

### Segment 5 (20:00 - 25:00)

Some fields here, like price per acre, aren't populated for every listing, and you do consume a fair number of operations because you have to iterate through all of this, but all in all, not bad. So that's how you get data from a paywalled resource like Crexi.

Now, I promised I'd touch on a little extra. What we've done is sort of cheating: we logged into the resource, and when you log in, the resource sends you back a cookie, an API key, or some type of token, and that token is your authentication for using the resource. Some APIs, typically older ones (and I believe Crexi's is one of them), let you reuse that token for a long time, maybe two, three, four, five, six hours. Others automatically kick you out after 30 minutes or an hour. When that happens, you need to get another token. How exactly? This is the part that may be a little more advanced. There are generally two ways to get a token from a resource and refresh it automatically every now and then.

The first: add another HTTP request module before the flow we built and call it something like "Get auth token." This won't work every time, but it often will. Here's how to find what it needs: I log out of the resource, click Log In, and do the exact same thing I did a moment ago: Inspect, then Network. I'm going to log in (well, not actually, because I don't want to show my real username and password, but imagine I typed example@gmail.com with a password like ExamplePassword890!). Let me refresh the Network tab and start from scratch, without preserving the log. When I submit the login form, the browser sends a request to a token endpoint; there you go. You'll see it includes the username and password in the payload, and it calls this URL: api.crexi.com/token. The only job of this endpoint is to generate a unique token if that username and password exist in the backend (in my case they don't, because example@gmail.com with ExamplePassword890 is probably not a real account). The response will contain a variable called token. So you could repeat the same steps we just did: copy the headers over, copy this URL into your "Get auth token" module, fill in the payload fields (there's a grant type, a username, a password, all these fields), and schedule the scenario to run, say, once a day at 7:17 p.m.

It will log into the resource for you and retrieve a token, and then you map that token as a variable into the Authorization header here. There will probably be a couple of extra steps: some tokens need to be formatted a certain way, some need the word "Bearer" in front, minor things that change from API to API. But that way you log into the service once a day, grab the token, and use it to call the specific hidden-API endpoint you want. That's probably the simplest way to do it. That said, it requires logging into the resource every day, and if you log in at the exact same time every day, very consistently, for long enough, their spam-detection measures may eventually figure out it's a robot and not a human. The other way is to use an external data source, like a Google Sheet, with a date-retrieved column and a token column, and then you use this
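The daily token-refresh step can be sketched as follows. This is a hypothetical outline: the /token endpoint, the grant_type/username/password fields, and the "Bearer" prefix are assumptions taken from the transcript, not a verified API contract:

```python
# Hypothetical token-refresh step; endpoint, field names, and the
# "Bearer " prefix are assumptions based on the transcript.
TOKEN_URL = "https://api.crexi.com/token"   # login endpoint seen in DevTools

def login_payload(username, password):
    # Fields observed in the login request's payload
    return {"grant_type": "password", "username": username, "password": password}

def authorization_header(token_response):
    # Some APIs want the raw token; others expect a "Bearer " prefix.
    return {"Authorization": "Bearer " + token_response["token"]}

# In the live scenario you would POST login_payload(...) to TOKEN_URL once a
# day and feed the result into every subsequent search request:
hdr = authorization_header({"token": "abc123"})
```

The two helpers correspond to the two halves of the Make flow: the "Get auth token" request module and the mapping of its output into the search module's Authorization header.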

### Segment 6 (25:00 - 27:00)

data source. So you build another scenario; we'd call it something like "Retrieve token from [service]." You'd feed in the Get-auth-token module we just sketched, and then add the token it returns to a shared data resource like Google Sheets: we'd select the "Example Crexi data dump" spreadsheet, and you'd have a field like dateRetrieved so you can constantly monitor when the most recent token was fetched. Then, depending on the specifics of that API (maybe the token expires after 17 hours or something like that), every 17 hours you run this scenario, call the token endpoint, and dump the fresh token into the Google Sheet. Back in the original scenario, you retrieve that token from the sheet on every run: the first module becomes Get a Cell instead of the Get-auth-token request, you grab the specific cell that holds the token, and you use it as the input to the Authorization header, "Bearer" plus whatever the token is. So those are the two main ways to do it.

There's a third way, of course: browser automation. It's pretty advanced, and you usually need to know how to code, but there are services out there that let you build little crawlers that physically visit a website, pretend to be a real browser, click through the username, password, and login fields, and then capture the cookies or tokens that were received. You can use those to fill a Google Sheet or another shared resource that your scenarios read from. I'll add a link down below to the resource I usually use for this sort of thing (Apify). It's definitely a skill worth knowing and worth having, but it's quite difficult, if I'm being honest, if you don't have any coding knowledge, and it will take a little time to get up and running. Regardless, it's one of the highest-ROI skills you can have.

I hope you enjoyed this video on hidden APIs: using Make in a completely new and hopefully quite valuable way. I use this approach all the time for my own clients, to build scrapers that realistically take me only 20 or 30 minutes, whereas they might take somebody else five hours to build in code. Very high-ROI flows. Thanks so much for watching. If you have questions about anything I talked about in this video, or requests for future videos on Make.com or something else, please let me know; I love the feedback, and these requests are what have decided my content calendar for the month. Leave a like, subscribe, tell all your friends, hopefully, and I'll see you in the next video. Thanks so much.
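The Google-Sheet-as-token-store idea above amounts to a time-to-live cache: store the token with a retrieval timestamp and only log in again once it has expired. Here is a minimal in-memory sketch of that logic; the 17-hour lifetime is the transcript's example, not a documented value:

```python
import time

# In-memory stand-in for the Google Sheet: a token plus when it was fetched.
TOKEN_TTL = 17 * 3600          # assumed ~17-hour lifetime from the example
cache = {"token": None, "retrieved_at": 0.0}

def get_token(fetch_fresh):
    """Return the cached token, calling fetch_fresh() only once it expires."""
    if time.time() - cache["retrieved_at"] > TOKEN_TTL:
        cache["token"] = fetch_fresh()        # e.g. run the login scenario
        cache["retrieved_at"] = time.time()
    return cache["token"]

calls = []
token = get_token(lambda: calls.append(1) or "tok-1")        # miss: logs in
token_again = get_token(lambda: calls.append(1) or "tok-2")  # hit: no login
```

In the Make version, `cache` is the spreadsheet (token and dateRetrieved columns), `fetch_fresh` is the token-retrieval scenario, and the expiry check decides whether the scraper reads the cell or triggers a fresh login.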
