7 Things You Didn’t Know Dataclasses Could Do

ArjanCodes · 41,887 views · 2,043 likes

Video description
💡 Learn how to design great software in 7 steps: https://arjan.codes/designguide.

Dataclasses are often treated as nothing more than a shortcut for generating init methods, but there’s a lot more going on under the surface. In this video, I walk through several lesser-known dataclass features by gradually improving a simple example and showing how small changes can have a big impact on correctness, safety, and design. If you think you already know dataclasses, there’s a good chance you’ll learn something new here.

🔥 GitHub Repository: https://git.arjan.codes/2026/dataclass
🎓 ArjanCodes Courses: https://www.arjancodes.com/courses
💬 Join my Discord server: https://discord.arjan.codes
⌨️ Keyboard I’m using: https://amzn.to/49YM97v

🔖 Chapters:
0:00 Intro
0:54 1. A Safe Default Field
2:33 2. Derived Fields
4:54 3. Dataclasses Are Still Classes
6:27 4. Frozen Dataclasses
9:04 5. Slots, Ordering, and Keyword-only Arguments
11:20 6. Custom Constructors
13:55 7. Serialization Helpers
15:06 8. Abstract Dataclasses
19:53 Final Thoughts

#arjancodes #softwaredesign #python

Table of contents (10 segments)

Intro

Most developers know the basics of data classes in Python. You define a simple class with some typed fields, and Python generates the boilerplate. Here you see an example of that. I have a User class. It's a data class. A user has a name and an email address. And because it's a data class, I can initialize it as follows, and I can simply print it. It's going to print out the information in a readable way. Here you can see the result of running this script. But data classes can do way more than that. So today I'll show you seven things you probably don't know about data classes, each of which makes your code safer, clearer, and easier to maintain. And the last one tends to surprise people. If you want to learn how to design software from scratch, grab my free guide at arjan.codes/designguide. It walks you through my seven-step process for designing new software. The link is in the video description. The first

1. A Safe Default Field

thing that's really easy to do with data classes is default values. For example, if I have an active field that's a boolean, I can make it true by default simply by assigning it a value. Then, of course, if I run this now, each of these users is going to be active by default. However, since it's a default value, I can still override it and set it to false for one of these users, like so. So then we get this as a result. Now, defaults get a bit more complicated when you're dealing with more complex types. Let's say you have tags, which is a list of strings. You might think you can do this. However, this is problematic, because this empty list is created only once, when Python runs the script, so every instance would share it. And if I run that now, you see that dataclass actually protects us against this by raising a ValueError. So instead of doing this, what you need to do is define a field, which you also need to import from the dataclasses module. And we're going to give it a default_factory of list, like so. And now when I run this again, you see that this works as expected. Now, there's a type error here. The issue is that this is a list of strings, but we specified the default factory as just a list. So if you want to be precise, you also need to specify the default factory as a list of strings. It doesn't really change anything about the way Python works, but it's more precise in terms of typing.
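The code shown on screen is not part of this transcript, so the following is only a minimal sketch of the defaults described in this segment. The field names follow the narration; the sample names and email addresses are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    email: str
    active: bool = True  # plain default: fine for immutable values
    # tags: list[str] = []  # rejected by dataclass with a ValueError (mutable default)
    tags: list[str] = field(default_factory=list)  # a fresh list for every instance
    # field(default_factory=list[str]) behaves the same and is more precise for type checkers


u1 = User("Alice", "alice@example.com")
u2 = User("Bob", "bob@example.com", active=False)  # override the default

u1.tags.append("admin")  # only u1's list changes
print(u1)
print(u2)
```

Because `default_factory` is called once per instance, appending to `u1.tags` leaves `u2.tags` empty.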

2. Derived Fields

The second thing you can do with a data class is that it allows for derived fields. These are fields that shouldn't be passed into the constructor. You can mark a field as not being part of the generated initializer, and this lets you compute it later, typically in the post-initialization step. This is particularly useful when the value should be stable and not recomputed every time, unlike, for example, a property that recalculates on access. For example, let's say when we create a user, we want that user to have a slug, which will be used in a URL. Now, typically this is something you would create only once per user, and you don't change it if the user information changes, because then the URL doesn't work anymore. So the slug is going to be a field, but we're going to set it to init=False, like so. So now this slug value is not something that you can set in the User initializer. But what you can do is define a __post_init__ method. This is a method that is called right after the object has been constructed. So what you can do here, for example, is, let's say we have a variable slugified, and that's going to be the name, converted to lowercase, with all the spaces replaced by dashes, like so. And then self.slug equals slugified. Now, you could set this directly. There's a reason why I'm doing this in two steps that will become clear in a second. __post_init__, by the way, is also really helpful for cleaning up some stuff. So for example, I could also change the name to title case, like so. And now, because my name here is lowercase, if I run this, then as you can see the name is properly formatted, and we also have our little slug here. And if I change the name to, let's say, Alice Smith, like so, and run this again. Now my slug is Alec. Did I actually write that? Oh, okay. This is what I meant. So now you can see that the slug is also properly constructed.
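As a rough sketch of the derived field described in this segment (the exact on-screen code is not in the transcript, and the sample values are placeholders), `init=False` plus `__post_init__` could look like this:

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    email: str
    slug: str = field(init=False)  # excluded from the generated __init__

    def __post_init__(self) -> None:
        # Runs right after the generated __init__ has assigned name and email.
        self.name = self.name.title()  # clean-up step: normalize to title case
        slugified = self.name.lower().replace(" ", "-")
        self.slug = slugified  # computed once, then stored as a regular field


user = User("alice smith", "alice@example.com")
print(user)
```

Unlike a property, the slug is not recomputed on access: changing `user.name` later leaves `user.slug` as it was.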

3. Dataclasses Are Still Classes

The third thing is to not forget that data classes are actually still classes. It's not just data. You can add methods, properties, helper functions, derived attributes, everything that a normal class can also have. And that means that, at least if you're being careful, your domain logic can live right next to the data that it belongs to. For example, you could define a method, let's say contact_card, that returns a string, let's say a formatted string containing the name and email address, like so. And now I can print the contact card for a particular user. And then this is what we get. Or you could add properties as well. Let's say we want to know the domain belonging to the email address. So in this case, we're going to take the email address, split it by the at sign, and then take the last part. And since it's a property, I don't need to write parentheses. Let's run this and see what happens. And there we go. We have the domain of that particular user. In short, a data class is just a regular class, so you can still add these sorts of things. Of course, be careful. Don't turn this into a god class, but this is still something that you can do. Fourth thing is that
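The method and property described in this segment might look roughly like the sketch below. The exact format of the contact card string is an assumption, since the transcript only says it contains the name and email address.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    email: str

    def contact_card(self) -> str:
        # Assumed format; the video only says "name and email address".
        return f"{self.name} <{self.email}>"

    @property
    def domain(self) -> str:
        # Everything after the last "@" in the email address.
        return self.email.split("@")[-1]


user = User("Alice Smith", "alice@example.com")
print(user.contact_card())  # regular method: called with parentheses
print(user.domain)          # property: accessed without parentheses
```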

4. Frozen Dataclasses

you can have frozen data classes. And that doesn't mean they're cold, but it means that the instance is immutable, and that improves safety by preventing accidental changes. Now, under the hood, this just blocks all assignments to attributes after initialization. So, in the case of the user, what I can do is turn this into a frozen data class, like so. However, now you see that there's a problem, in that we're trying to assign to attributes in the __post_init__ method, and that is now broken, because if I run this, you see that we get an error. So if you have a frozen data class, you can actually still do things like this, but you need to be a bit more explicit about it. So first, let's just store this in a normal local variable, like so. And now, instead of assigning directly, which is not allowed, we can call object.__setattr__. We're going to set the name attribute to the normalized name, and we will set the slug attribute to the slugified value, like so. So now when I run this again, this works as expected. Now, as you can see, it's kind of annoying that we have to call it like this. I'd actually prefer it if a frozen data class would, for example, still allow you to set values in the __post_init__ method, but that would actually change the syntax of the Python language, and dataclasses is simply a module that is built on top of Python. So that simply doesn't work. However, as you can see, this is still possible to do. Now, one thing to remember is that immutability applies only to the attributes themselves, not to the contents of mutable objects that are stored inside them. And that means that a frozen data class with, for example, a list of tags is still vulnerable if you mutate the list later on. For example, if I take my user, let's say user two, I can take its tags list and simply append a value to it. And that is perfectly fine, because I'm not mutating the object. I'm mutating the list of tags inside of it, and a list is a mutable object. 
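A minimal sketch of the frozen variant, assuming the same User fields as in the earlier segments (sample values are placeholders), shows the `object.__setattr__` workaround and the mutable-contents caveat:

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    email: str
    tags: list[str] = field(default_factory=list)
    slug: str = field(init=False)

    def __post_init__(self) -> None:
        normalized_name = self.name.title()
        slugified = normalized_name.lower().replace(" ", "-")
        # self.name = ... would raise FrozenInstanceError, so bypass the
        # frozen __setattr__ explicitly.
        object.__setattr__(self, "name", normalized_name)
        object.__setattr__(self, "slug", slugified)


user = User("alice smith", "alice@example.com")

try:
    user.name = "Bob"  # rebinding an attribute is blocked
except FrozenInstanceError as exc:
    print(f"blocked: {exc}")

user.tags.append("admin")  # mutating the list stored in the attribute is not blocked
print(user.tags)
```

If the tags should be immutable as well, storing them in a tuple instead of a list is one option.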
So here you see I've now added this tag. So that's something to be mindful of. Frozen doesn't mean that everything in a data class is read-only. It only applies to the attributes. But

5. Slots, Ordering, and Keyword-only Arguments

there are more options that you can add to data classes. For example, ordering, which makes them comparable, so you can sort them or use them in priority queues without writing the comparison logic yourself. Now, the thing is, by default (let me just remove this for now), let's say we want to print u1 less than u2. So we're trying to compare these two, and when I try to run this, it doesn't work, because less-than is not supported between these two instances. But what you can do is add an order=True option to the data class. And now you see the error also disappears. I'll run this again, and now we actually see that user one is less than user two. So this actually works. So it's really easy to add this kind of behavior with data classes. Another thing you can do is set slots to True. This improves memory usage and speeds up attribute access by removing the instance dictionary. That also means that you can't add attributes on the fly anymore; that's simply not going to work. But in most cases, you're not doing that anyway. So setting slots to True is a really great way of making data classes that hold lots of data faster in terms of attribute access. And as you can see, this still runs in exactly the same way. Another thing you can do is make fields keyword-only, and that means that you are limiting the way the initializer can be used. For example, now I can't define a user simply by passing the arguments in order to the constructor. If I try to run this code, it's not going to work, because it's not what the initializer expects. So what this does is force you to use the keyword argument names, which makes defining users in this case a lot clearer. Now, here we already know this is the name and email address, but it can still help with readability in certain cases. 
And of course, with keyword-only arguments, it doesn't matter what the order of the arguments is, and it makes object construction overall clearer and more explicit. So the result of making these types of changes is that you have a data class that's faster, safer to use, and easier to maintain. Now it's

6. Custom Constructors

possible that next to the initializer that the data class generates, you also want to have other constructors. And because it's simply a class, you can add custom class methods that act as alternative constructors. I actually do this all the time, and these allow you to create objects from domain-specific inputs. For example, let's say you want to generate a user from just the email address, and the name is basically extracted from the email address. So what you could do in that case is have a class method, and let's call that from_email. It gets the class and an email address. That's the only argument that it gets. And this is going to return Self, and Self is something that we import from typing. And now we first create a name from the email address. So we do email.split. And I'm not doing a lot of validation here. I'm just assuming that this is a valid email address. If you want to make this more robust, you should add validation. And we're going to take the first part. And let's say we replace the dot by a space. So we get first name dot last name, and we capture that like so. And then what we do is return cls, where we pass the name and we pass the email, and this should actually be local, like so. And now what you can do is create a user from an email. Let's say we have John Smith at the very famous website example.com. And we're going to print user three. Let's run this code. And now you see we have our user with name, email address, and it has all the other things set by default as well. And the only thing I did is add a class method that allowed me to construct a user from an email address. Now, like I said, you can also do this with regular classes, but I typically like to do this with data classes where I want to have various ways of constructing the object, because you can imagine that you add more class methods here that construct users in lots of different ways. 
So this is a really clean way to allow for more flexibility with object construction while still relying on everything else that data classes offer. The seventh

7. Serialization Helpers

thing that is a really nice feature is that there are serialization helpers. Data classes provide built-in ways to convert objects into dictionaries or tuples, and these are part of the dataclasses module. So we have asdict and we have astuple, and what you can do, for example, is call asdict on, let's say, user one and print that, or call astuple on, let's say, user two and print that as well. And when we run this, this is what we get. So the first thing we have here is a dictionary that's a representation of that particular user, and here we have a tuple containing simply all the values of that second user. These helpers are really nice for logging, testing, structured data processing, or preparing data for JSON serialization. They give you a predictable, stable snapshot of an object's fields. Now, before I show you the final thing, which is something really cool that you can do with data classes: if you're enjoying this video, please give it a like and subscribe to the channel if you want to watch more of this type of content. It helps me more than you think and it keeps the channel growing. Now the final thing I want to
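The `asdict` and `astuple` helpers mentioned in this segment can be tried with a small sketch. The User fields follow the earlier segments, the sample values are placeholders, and the `json.dumps` call is an added illustration of the "preparing data for JSON" use case.

```python
import json
from dataclasses import asdict, astuple, dataclass, field


@dataclass
class User:
    name: str
    email: str
    active: bool = True
    tags: list[str] = field(default_factory=list)


u1 = User("Alice", "alice@example.com", tags=["admin"])
u2 = User("Bob", "bob@example.com", active=False)

as_dict = asdict(u1)    # field names mapped to values, copied recursively
as_tuple = astuple(u2)  # values only, in field definition order

print(as_dict)
print(as_tuple)
print(json.dumps(as_dict))  # works here because every value is JSON-compatible
```

Both helpers copy nested containers, so the resulting dictionary does not share its `tags` list with the original object.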

8. Abstract Dataclasses

show you is that data classes can be combined with abstract base classes. Yes. And this lets you define shared structure and required behavior while still generating constructors automatically. And this is actually really neat. So from abc I'm going to import ABC and abstractmethod. And then let's say we create an Account data class, and Account is an abstract base class. Account is going to have an owner, and let's say it's going to have some sort of base fee, which is a float. And then we can have an abstract method, which is the monthly fee, and that's going to return a float value, like so. And actually we can even turn this into a property. So now we have an account. It's an abstract base class. What I can do now is create another data class called FreeAccount, which is a subclass of Account. And this implements the monthly fee property, and for a free account that's simply going to be zero. But we can now also have a PremiumAccount, like so, which is also an Account. But premium accounts have, let's say, extra storage in gigabytes, an integer. Let's set that to 100 by default. And there we also have the monthly fee property, but here of course it needs to be slightly different. So let's say this is going to be self.base_fee plus (let's use parentheses) self.extra_storage times 0.10, like so. So now I have my two account types. And you might think that this completely breaks, because we have a data class here that has two of these attributes. FreeAccount doesn't have any attributes of its own, but it's still a data class. PremiumAccount has an extra attribute. But actually all of this works as you would expect. So let me remove this and create a free account. This is a free account. And now what I can do is set the owner and the base fee, which it inherits from the Account abstract base class. So I can set the owner, and I can set the base fee to, let's say, 10. 
Now, the base fee doesn't matter, because in the case of a free account the monthly fee is always zero, but you can still set it. And then if I print the free account, you see that it actually does what we expect it to do. So I'm the owner and the base fee is 10. If we create a premium account, I can use the PremiumAccount class. And now you see we have three arguments: the owner, the base fee, and extra storage, which was added as an extra argument here. So let's create an owner. We set the base fee to, let's say, 20, and extra storage is 50, like so. And then let's also print the premium account. There we go. And as you can see, this works as expected. And we can also print the monthly fee, like so. The monthly fee is a property, so let's print that. That gives us 25, and that's also what we expect, since the base fee was 20 and the extra storage of 50 is multiplied by 0.1, which adds 5. So that comes to 25. So if you're using abstract base classes, this actually works really well together with data classes. This pattern is surprisingly expressive, and maybe you didn't know that data classes and abstract base classes work so well together. Now, of course, you have to be careful with inheritance in your design. Don't create deep inheritance hierarchies with lots of dependencies, because inheritance is one of the strongest types of coupling that there is. But if you need it, it works really well with data classes.
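A sketch of the account hierarchy described in this segment is given below. Class and field names follow the narration, while the owner name is a placeholder because the transcript does not record it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Account(ABC):
    owner: str
    base_fee: float

    @property
    @abstractmethod
    def monthly_fee(self) -> float:
        """Fee charged per month; every concrete account type must define it."""


@dataclass
class FreeAccount(Account):
    @property
    def monthly_fee(self) -> float:
        return 0.0  # base_fee is inherited but ignored


@dataclass
class PremiumAccount(Account):
    extra_storage: int = 100  # gigabytes

    @property
    def monthly_fee(self) -> float:
        return self.base_fee + (self.extra_storage * 0.10)


free = FreeAccount(owner="Alice", base_fee=10)  # placeholder owner
premium = PremiumAccount(owner="Alice", base_fee=20, extra_storage=50)

print(free, free.monthly_fee)
print(premium, premium.monthly_fee)  # 20 + 50 * 0.10

try:
    Account(owner="Alice", base_fee=10)  # abstract: cannot be instantiated
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The generated `__init__` of each subclass takes the inherited fields first, followed by the fields the subclass adds, which is why `PremiumAccount` accepts three arguments.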

Final Thoughts

Now, I hope this video showed you that even though data classes seem simple at first, they hide a surprising amount of expressive power. Let me know in the comments which of these features you didn't know about. To be totally honest, I don't use data classes all that often in my Python code, except for one specific case. If you want to learn exactly when that is, when data classes truly make sense, watch this video next. Thanks for watching and see you next
