This code looks pretty clean, right? There is a nice interface here. We're using a protocol. There is a container. We can call a method and use different options. In this case, this is about exporting some sales data. So it all looks great. But when you scroll up, you see that behind this there's actually a huge mess. We have a container that has a bunch of settings hardcoded in it. It passes these settings directly to a report service. The report service then stores all of these settings, and there is a run method that basically contains everything. The problem here is that this code cleaned the wrong thing, and that led to a bad design. Today I'll refactor this code and show you why clean code can lead to worse designs and how you can avoid that. This video is sponsored by Hostinger. More about them later.
What do people mean when they say clean code? A lot of people turn it into a checklist: you need small classes, short methods or functions, everything needs to be abstracted away, you need to apply design patterns. And if you apply those rules blindly, you often get over-decomposition: too many pieces for a small problem, and a design that looks tidy on the surface but still makes changing the actual behavior really hard. In this case, the problem is not the intention. The problem is that you are optimizing for smallness instead of cohesion.
Now let's take a look at this code. There is an abstraction, the report service, which has a single method. In a sense that is nice. This report service is then created through a container. The designer of this code probably thought: hey, we need a container so that we can do dependency injection. But when you look at the code itself, the container doesn't really do anything useful, especially since we already have a report service object that we can inject directly.
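To make that critique concrete, here is a hypothetical reconstruction of the kind of structure being described: a single-method protocol, a service that stores every setting, and a container that only hardcodes values. The original code is not shown in the transcript, so every class name, field name, and value below is an assumption.

```python
from typing import Protocol


class ReportService(Protocol):
    """Single-method abstraction (name assumed)."""

    def run(self, source: str, target: str) -> None: ...


class DefaultReportService:
    """Hypothetical service: stores all settings and does everything in run()."""

    def __init__(self, country: str, delimiter: str, min_revenue: float) -> None:
        self.country = country
        self.delimiter = delimiter
        self.min_revenue = min_revenue

    def run(self, source: str, target: str) -> None:
        # In the design being criticized, loading, filtering, aggregating
        # and exporting would all be packed into this one method.
        print(f"report for {self.country}: {source} -> {target}")


class Container:
    """Hypothetical container: it only hides hardcoded settings."""

    def report_service(self) -> ReportService:
        return DefaultReportService(country="NL", delimiter=",", min_revenue=10.0)
```

The calling code looks tidy (`Container().report_service().run(...)`), but the settings and the pipeline are both out of reach of the caller.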
So there is abstraction here, but it doesn't really solve a problem. On the other hand, if you scroll up, you see that the actual report service is still very complicated. It gets all of these settings, and there is the run method, which is really long and tries to do all of these different things. So there are abstractions, patterns, and principles in this code, but they're in the wrong place, and that makes the design worse because it isn't aligned with the actual behavior. When you organize your code, it is not about making your methods smaller or adding more abstraction everywhere. You need to understand the reasons to change, and organize your code so that things that change for the same reason live together. In this script, which is about exporting sales data, there's one reason to change, which is how you define and export the report. And by the way, if you run this code, you can see what it actually results in. For a particular country, it counts the number of sales and adds up the revenue. Here it writes that to the console, but it also wrote a report, and this is what that report looks like. Here's an example of a sales CSV that this program loads. So that's what this does. And then there are some filters and settings like the country, the minimum revenue, whether we should take refunds into account, etc. So when I refactor this code, the goal isn't fewer lines of code. It's to make this pipeline of settings and exports more visible, group things that change together, and keep the behavior understandable and easy to modify.
Now, before I start refactoring: at some point this little CSV script may become something you want to run somewhere, maybe as a recurring job or as part of a bigger system. And that's where today's sponsor, Hostinger, comes in. For projects like this, a VPS is often the sweet spot. With Hostinger's VPS, you get full root access, your own dedicated IP, and you can choose from multiple Linux distributions.
Their servers run on NVMe SSD storage and AMD EPYC processors. And with KVM virtualization, your resources are isolated, so performance stays stable. Right now, they're running their New Year's resolution sale. Pricing is really good, starting at only $6.49 a month. Now, the KVM 2 plan is perfect for a project like this. When you choose this, you get 63% off. And on the 24-month plan, you get 2 months for free. It's a really strong price-to-performance ratio and you know exactly what you're paying for. There's no surprise cloud bill at the end of the month. When you add it to your cart, don't forget to use my coupon code, which gives you an extra 10% off. And once you're in, setup is really straightforward. Just choose your server location, pick your OS, and you're ready to go. You can deploy something like this Python script in minutes. And you also get automatic backups and 24/7 support. Use the link in the description. Get started with Hostinger today. And don't forget to apply the coupon.
Now, let's clean this design up. The first thing that I'm going to do is remove some of these classes and abstractions that we
Segment 2 (05:00 - 10:00)
actually don't need. So I'm going to start with the container and directly create the report service in the main function instead. I'll copy this over, and then here I'm going to simply create the service, like so. And then of course we now need to provide these settings here. So the country is the Netherlands. Encoding is UTF-8. The delimiter is a comma. This is false. And this was 10.0, like so. So now we no longer need to do this. Let me run this just to make sure that this still works. And it does. And that means that now we can remove this container, and we can also remove the report service protocol. So that already shortened things.
The next thing is that the report service is actually a single method, so we can also turn this into just a function. The only problem is that there are currently a lot of arguments being passed. There are five here, and maybe we'll add more settings later on. And the run method itself also gets a source and a target. So instead of putting all of this information in the report service class and then having a run method, we can introduce a report configuration, which is then a separate object. And that's a good example of the difference between trying to make things small versus cohesive. The report config as a concept makes a lot of sense. You can also see that in the current version, all of these config settings are not shown explicitly. They're passed as arguments, but it's not really clear what they are. If we turn that into a config object, this is much cleaner. On top of that, a config is something that you want to change in different places in the code, or maybe you want to have multiple configs. So turning that into a class makes a lot of sense. So let's create a data class, report config, and we can make this frozen. Inside the report config, we're going to have these settings. I'm just going to copy these over. And we can now also choose sensible defaults.
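As a rough sketch, a frozen configuration data class along these lines could look as follows. The field names and the "NL" country code are assumptions; the values mirror the settings read out above (the Netherlands, UTF-8, comma, false, 10.0).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportConfig:
    # Field names are assumed; the transcript only reads out the values.
    country: str = "NL"            # "the Netherlands"; the exact code is a guess
    min_revenue: float = 10.0
    allow_negative: bool = False
    delimiter: str = ","
    encoding: str = "utf-8"


# All defaults, as in the "before" version of the script.
config = ReportConfig()

# Frozen means you create a new config instead of mutating an existing one.
stricter = ReportConfig(min_revenue=50.0)
```

Because the class is frozen, a config passed into the pipeline cannot be changed halfway through, which makes it safe to share between several calls.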
So in particular, delimiter and encoding. That's something you want to define by default, so I'm just going to put these at the bottom. So delimiter is a comma and encoding is going to be UTF-8. We can also set allow negative to false, and you can even choose a default country and minimum revenue setting. Now, I'd say these are probably things that you will regularly want to change. But just for the sake of completeness, let's make the minimum revenue 10 and make the default country the Netherlands, like so, just like we had in the before version of this code.
So now we have a report config, and instead of having this default report service object, we can turn that into a single function. I'll work on the function itself later on. So let me just deindent this, and I'm going to remove all of this, and instead of passing self we're going to pass a report config, like so. And then anywhere we have self dot something, this needs to become config, like so. There we go. I think we got everything. And actually maybe it makes sense to put the config as the last argument, like so. Now of course we need to update the main function. In this case we're going to create the report config, like so. All of these defaults are already here, so I'm just going to create the instance here. And then we're just going to call run and pass the config, like so. Let's run this. Make sure it works. And it does. So that's already a big change. The nice thing now is that in the main function, we have access to the configuration because we create the object here. That makes way more sense than what we had before. And you can now easily change things here as well. Let's say I set min revenue to a higher number, like, I don't know, 50 or something. Then if I run the code again, you see that the revenue is now lower because we're ignoring basically anything below €50. So this gives us a lot of
Segment 3 (10:00 - 15:00)
control. It gives us access to what we need to change while still taking care of cohesion, in the sense that the configuration settings are all together in one object. So it's not about making everything small. It's about thinking about what you need to change where, and what belongs together.
The next problem, like I mentioned, is that this run function that we have now basically contains the entire pipeline. But if you look at what is happening here, we actually have three things. We have loading, which loads the CSV file. We have some validation, filtering, and aggregating; that's basically the summarize part of the job. And we have the export, which is: where do we save that data? Is that a file? Is that standard out? Is that something else? So instead of putting everything together, we can make that pipeline explicit and more controllable from the outside. So let's split this up. Here we have the first part, which is about loading. I'm simply going to copy this whole part of the function, and I'm going to call this load data. Now, load data of course needs a source. It doesn't need a target anymore, and it still needs the report configuration because that contains things like the delimiter and the encoding. And then, you see that we turn a source string into a path object here. Instead of that, we can just directly pass a path, and that makes the function even a bit shorter. This of course needs to be the source, like so. And here we're going to open the source. And then we simply return the result, like so. What this needs to return is a list of dictionaries. But let's also define that more explicitly. I'm going to create a type alias, data, and this is going to be a dictionary from string to string. So here I return a list of data, like so. And this comment is no longer needed. So now we have our load data function. And then what we can do here is pass the data instead of the path.
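A minimal sketch of the loading step described above, assuming the rows are read with `csv.DictReader` (the transcript does not say which reader the original uses). The `Data` alias and the `load_data` signature follow the description; the trimmed-down `ReportConfig` is included only to keep the example self-contained.

```python
import csv
from dataclasses import dataclass
from pathlib import Path

# One CSV row: column name -> raw string value.
Data = dict[str, str]


@dataclass(frozen=True)
class ReportConfig:
    # Only the settings the loader needs; field names are assumed.
    delimiter: str = ","
    encoding: str = "utf-8"


def load_data(source: Path, config: ReportConfig) -> list[Data]:
    """Read every row of the CSV file into a list of dictionaries."""
    # newline="" is what the csv module recommends when opening files.
    with source.open(newline="", encoding=config.encoding) as f:
        return list(csv.DictReader(f, delimiter=config.delimiter))
```

Taking a `Path` instead of a string pushes the conversion to the caller, so the function only does one thing: read rows.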
Like so. This whole part is no longer needed, and we just need to fix the names, like so. And then of course in the main function we now need to actually load the data, like so. And then I also need to change that here. Let's run this again and see whether it still works. And it does. Another really nice effect here is that, again, we have more control in the main function because we decide when and where we load the data. And because we do that here, we now load this data only once and then call the run function twice with that same data, whereas the before version actually loaded the data twice. So it gives more control and it's more efficient.
Next, we can organize the run function even more. For example, take a look at this summary object that we create here. And then we have some export options. Here, for example, is a text version that we create of the summary. We can clean this up by making turning a summary into text part of the summary class, which is where I think it belongs. The only thing is that currently the summary object contains count and revenue sum, but when you look at the text, the summary should also contain the country, because that is part of the actual summary. So instead of doing it like this, which is kind of messed up, let's update summary so that it also includes the country, like so. And then of course here we also need to store that, like so. What we can do next is take this, go to our summary class, and create a method to_text. We're simply going to return this, but of course it's going to use self.country, self.count, and self.revenue_sum, like so. And now we no longer need to do this. We simply do summary.to_text(). And here the same thing, like so. The nice thing now is that we can optionally add formatting options
Segment 4 (15:00 - 20:00)
here, and that is going to be part of the summary, and then export functions can use this. And that's where we can split out the next thing, which is the export functionality. Here there's an if-else statement: depending on the target, we're going to do something else. But this is also overcomplicating things, because we actually need the control here in the main function. We want to say: okay, export it to there, or do this, or do that. And now we're passing along this information and then making the decision here. So instead of doing that, we can simply use different functions. Let's create two functions. The first is export standard out, and this is going to get a summary object. And that basically does this. And then we have export file, which gets a summary and a path. And that's where we copy over this. Again, I use a path directly as an argument here, so there's no need for conversion, and this line is not necessary. Then, as you can see, there is a config.encoding here. You can choose to pass the report config here. You can also leave it out and decide to simply pass the encoding as an argument with a default value. Sometimes I like to do that if it's just a few settings. So here, let me just add an encoding, which is a string, set its default to UTF-8, and then pass it along here. Now you still have the control, but by default it's going to have a sensible value, and we don't need to pass the report config to the export function, which I think is also a nice thing. It decouples things a bit more. So we have our two export functions. And then this part is not going to be part of our run function anymore. The run function is simply going to return the summary, like so. And then what we can do here is actually call that: summary is run with the data. And of course what we also don't need anymore in that run function is the target, because that's handled by the exporter.
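The summary class and the two export functions described above might be sketched like this. The function names `export_stdout` and `export_file`, the field names, and the exact wording of the text report are assumptions; the transcript only describes their roles.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Summary:
    country: str
    count: int
    revenue_sum: float

    def to_text(self) -> str:
        # The report layout is assumed; only the three fields come from the talk.
        return (
            f"Country: {self.country}\n"
            f"Sales: {self.count}\n"
            f"Revenue: {self.revenue_sum:.2f}\n"
        )


def export_stdout(summary: Summary) -> None:
    """Print the text report to the console."""
    print(summary.to_text())


def export_file(summary: Summary, path: Path, encoding: str = "utf-8") -> None:
    """Write the text report to a file; encoding is a plain argument with a default."""
    path.write_text(summary.to_text(), encoding=encoding)
    print(f"Wrote report to {path}")
```

With one function per destination, the caller decides where the summary goes, and the export code no longer depends on the configuration object at all.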
We just need the data and the config. Let's go back here. So there we go, we have our summary. And now finally we can do the exports. So let's call export standard out, where we simply pass the summary, and then export file, where we pass report.txt. And of course, this needs to be a path object, like so. I'm messing up here. Okay, there we go. Let's run this. Yep, still works. And again, this gives us way more control in the main function, because now we're not only controlling when we load the data, we can also control when we create the summary. And again, we don't have to do that work twice. We simply do it once and then export it twice. So this is even more efficient than what we had before.
The final thing is to organize the rest of the code a bit. First, naming: this is currently still called run, but it doesn't really run anything. It creates a summary. So let's simply call this summarize, because that more correctly reflects what it is now doing, since it no longer loads or exports the data. This is also a comment that's no longer needed. Now, what we can do in the summarize function is organize this slightly differently. As you can see, it keeps track of a count and a revenue sum, it goes through each of the rows in the data, and there are some business logic checks here. Let's split up that work. All of these checks, that's basically the filter, and then there is actually summing the revenue and creating the summary object. So what you can do is create an internal function called is valid, which checks whether a certain row is valid for our report. It's going to get a particular row of data and return a boolean value. Here are, in essence, the conditions, so we can just copy this over. Let's put it here. And then let's also put this here, like so. And now we're going to return false in the special cases.
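The split described here, a filter followed by an aggregation, could look roughly like the sketch below. The actual business rules are not spelled out in the transcript, so the column names ("country", "revenue") and the exact conditions in `is_valid` are assumptions chosen for illustration.

```python
from dataclasses import dataclass

Data = dict[str, str]


@dataclass(frozen=True)
class ReportConfig:
    country: str = "NL"
    min_revenue: float = 10.0
    allow_negative: bool = False


@dataclass
class Summary:
    country: str
    count: int
    revenue_sum: float


def summarize(data: list[Data], config: ReportConfig) -> Summary:
    def is_valid(row: Data) -> bool:
        # Assumed rules: wrong country, unparsable revenue, disallowed
        # refunds, and small positive amounts are the "special cases".
        if row.get("country") != config.country:
            return False
        try:
            revenue = float(row.get("revenue", ""))
        except (TypeError, ValueError):
            return False
        if revenue < 0 and not config.allow_negative:
            return False
        if 0 <= revenue < config.min_revenue:
            return False
        return True

    # Filter first, then aggregate over the rows that passed.
    valid_rows = [row for row in data if is_valid(row)]
    revenue_sum = sum((float(row["revenue"]) for row in valid_rows), 0.0)
    return Summary(config.country, len(valid_rows), revenue_sum)
```

Keeping every rule inside `is_valid` means a new filter condition touches one small function, while the aggregation stays a couple of lines.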
Segment 5 (20:00 - 25:00)
And in the end of course we return true. This is our filter function. And now the rest of the code becomes way simpler, because we can create the valid rows using a list comprehension, like so. This is not called rows, it's called data. And in the same way we can create the revenue sum. I'm missing a parenthesis here, and here I have a couple too many, I think. Of course I need to do this. Let's also rename this. And let's put in a default value of zero, just in case. Now let's remove this for loop, because that's no longer needed. And then we're going to return the summary: the count is going to be the length of the valid rows, and the revenue sum is what we computed right here. So then this is what we get. Let's run this. Make sure that it still works, and it does. So now we took another step in improving cohesion, in the sense that we grouped all the business logic that determines whether a certain row of data is valid, and that simplifies things even more.
So in the end, we created a new class for the configuration because those settings belong together. We added a type alias. We expanded summary so that some of the behavior, like converting it into text, lives there. And we split up the pipeline into explicit functions: loading the data, summarizing it, and exporting it somewhere. And in the main file that gives us a lot of control. For example, let's say we want to add a new type of export. Let's say we want to export this to JSON. Well, now there is a logical way of implementing this as part of our script. For example, here we might want to add a to_json method so that we can turn the summary into a JSON value. For the return type, we can use a type alias again. So let's create a type JSON, which is also a dictionary. JSON is not really just a dictionary, it can also be a list, but let's keep it simple here. It goes from string to, let's say, string, int, or float. Again, this is not a complete JSON spec.
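Here is a sketch of the JSON alias and the to_json method just described, together with the kind of export function the walkthrough goes on to build. The key names, the inclusion of the count, and the name `export_json` are assumptions; as the transcript says, the alias is deliberately simplified and is not the full JSON spec.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Union

# Simplified alias: a flat object with scalar values only.
JSON = dict[str, Union[str, int, float]]


@dataclass
class Summary:
    country: str
    count: int
    revenue_sum: float

    def to_json(self) -> JSON:
        # Written out by hand so the key names are chosen explicitly.
        return {
            "country": self.country,
            "count": self.count,
            "revenue_sum": self.revenue_sum,
        }


def export_json(summary: Summary, path: Path, encoding: str = "utf-8") -> None:
    """Serialize the summary with json.dumps and write it to a file."""
    path.write_text(json.dumps(summary.to_json(), indent=2), encoding=encoding)
    print(f"Wrote JSON report to {path}")
```

Adding this export does not touch loading or summarizing at all, which is the practical payoff of having the pipeline split into explicit steps.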
You'll probably want to add a few other things here as well, but for now, this is good enough. And then here, to_json is going to return a JSON object. So we're going to have the country and revenue sum. And yes, you can directly turn a data class into a dictionary by using asdict, but writing it out by hand gives us a bit more control over what we include in the JSON and how we name things. So there's a trade-off there. So we can now turn a summary into JSON format. And then what we can do is have an export JSON function. For that we're going to import json, like so, and then create the function. This is going to get the summary as usual. We'll also need a path. First we create the JSON data. Then we're going to write json.dumps of the data with, let's say, an indent of two. And then let's copy over this print log. There we go. And now we can call export JSON. It's very easy: we pass the summary and a path, report.json. That's all we need. Let's run this. And let's also see what it generated. Now we have our nice JSON file. In the before version of the code, this would have been very complicated. First, we wouldn't even have had access to this part of the pipeline because that was all in the container. And even if we exposed that, we would still need to pass all of these arguments everywhere. Whereas now, it's just simple. Everything is in its right place. You understand what the program does in like 10 seconds. There are no containers or services, just the
Segment 6 (25:00 - 26:00)
pipeline. It's really easy to modify. And by the way, if you enjoy videos like this one, give it a like and subscribe to the channel. This helps me reach more people here on YouTube. I really appreciate it. So in short, clean code is not about keeping the number of classes limited or keeping everything small. It's about truly understanding which things are related and making sure they live together so that you have high cohesion. Think about visibility. In this case, we wanted the pipeline to be explicit so we can change things. Think about where the meaningful boundaries are. Separate things only when it clarifies behavior instead of hiding it, like we had at the start. I'm not saying to never use abstraction. I love abstraction. It makes sense, especially when you need to swap implementations. But just adding it randomly everywhere is not going to make your code clean. But I'd like to hear what you think. Have you ever worked on a project where everything looked clean, but then you opened it up and there was a huge monster in the closet? How do you decide how to structure your code and where to place boundaries? Let me know in the comments. Now, a great example of applying clean code principles in the right place is value objects. If you're not using them yet, you have to check out this video. I explain exactly what they are and how they can improve your code massively. Thanks again to Hostinger for sponsoring this video. Thank you for watching and see you next