An Introduction to Python 3.7 Data Classes

Sometimes it’s fun to just nerd out for a moment. Python is one of the most popular scripting languages today and we love it too. That’s one reason why LogicMonitor can execute any script or programming language supported by your environment. That means you can enjoy the latest features of your favorite languages in your favorite monitoring tool (right?). If Python is your vice, give Python 3.7’s Data Classes a try!

Understanding The Past to Appreciate the Present

Groovy, a JVM language closely related to Java, and PowerShell are two languages that I believe strongly support OOP in a user-friendly way. They let you quickly manipulate custom objects and report their values. Python, while wildly popular and powerful, lagged a bit behind. To see some past examples, search the internet for “Python print class attributes”. At the time of this article, the top search results come from one of the most popular developer resources, Stack Overflow.

It’s not much better if you search the official Python 2.7 documentation for classes. Notice that every time its examples print information about a class, they assume you already know exactly what you want to show, or that you’re willing to spend the time explicitly printing each class attribute.

Based on the search results and answers, as well as my own experiences, I saw two main challenges for working with custom Python objects:

  1. There was no well-established, baseline method of printing class attributes. This led to users creating their own solutions, which were often incomplete, brittle, slow, or convoluted. Some of the accepted Stack Overflow answers can still leave you scratching your head, even if you’re not new to Python.
  2. Effective and clean solutions for printing class attributes, as seen in Stack Overflow answers, depended on knowledge of unnecessarily granular or specific Python features such as vars, dir, an object’s internal dictionary, string formatting, or list comprehensions. While all of those things are worth learning, especially string formatting and list comprehensions, none of them should be required in order to print an object’s attributes. Users are accustomed to using the simple syntax print(some_variable) (print some_variable in Python 2) in order to display an item’s value, and it should be no different for classes.
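To make the older situation concrete, here is a minimal sketch of how a plain class prints by default and the kind of workaround those answers tended to rely on. The Server class and its attributes are hypothetical, invented purely for illustration.

```python
class Server:
    """A plain class with no custom __repr__ (hypothetical example)."""

    def __init__(self, hostname, cpu_count):
        self.hostname = hostname
        self.cpu_count = cpu_count


server = Server("web01", 4)

# The default representation only names the class and a memory address,
# for example <__main__.Server object at 0x...>, not the attribute values.
default_text = repr(server)

# A typical workaround: read the instance's internal dictionary with vars()
# and format each attribute by hand.
workaround_text = ", ".join(
    "{}={!r}".format(key, value) for key, value in vars(server).items()
)

print(default_text)
print(workaround_text)  # hostname='web01', cpu_count=4
```

The workaround does the job, but every class needs its own copy of it, or a shared helper that someone has to know about.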

For those who are less script savvy, let’s use an analogy to explain the problem. Let’s picture a Python class object as a physical newspaper full of stories. You’re the one who printed this newspaper. You already know the stories. You carefully placed each paragraph, story, and headline. In previous Python versions, when you handed the newspaper to Python and, as a sanity check, asked it to read you one of the news reports, it made you specify each paragraph you wanted read instead of just giving you the whole story. Cute when your toddler does that; frustrating and concerning when your 30-year-old aspiring journalist does.

Python Starts Adulting

Late Blooming is Better than Never Blooming

In 2017, PEP 557 conceptualized Data Classes, which “exist primarily to store values which are accessible by attribute lookup”, and nearly a year later, the release of Python 3.7.0 officially offered an implementation. Data Classes piggyback off of type hinting (PEP 484) and variable annotations (PEP 526), respectively introduced in Python 3.5 and 3.6. When it comes to working with objects, people expect to be able to consistently predict data types. While Python still isn’t a statically typed language, and the hints are not enforced at runtime, the combination of type hinting and annotations helps both users and IDEs know what data types to expect when working through a larger codebase.
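As a rough illustration of how those pieces fit together, the sketch below shows that annotated class attributes are recorded on the class, and that the decorator turns each annotated name into a field. The Point class is a hypothetical example.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:
    x: int
    y: int = 0


# Variable annotations (PEP 526) are stored on the class itself.
print(Point.__annotations__)  # {'x': <class 'int'>, 'y': <class 'int'>}

# The decorator reads those annotations and exposes each one as a field.
field_names = [f.name for f in fields(Point)]
print(field_names)  # ['x', 'y']
```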

Laziness Drives Automation

Another part of maturing is being honest about what you care about and how to spend more time focused on that. It’s possible that it’s just me, but I feel like Data Classes were undersold, or someone really enjoys dealing with tedious processes. The exact words from PEP 557 regarding the Data Class are “[…]there’s really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given. […] Data Classes save you from writing and maintaining these methods.”

What are these “generated methods”? Fortunately the abstract of PEP 557 gives an example. Let’s say you had a Python class defined as follows:

from dataclasses import dataclass

@dataclass
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        # Total value of the stock on hand for this item.
        return self.unit_price * self.quantity_on_hand

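Per that example, the Data Class decorator (“@dataclass”) adds the equivalent of the following methods to the class. Note that in the released dataclasses module, the ordering methods (__lt__, __le__, __gt__, __ge__) are generated only when you pass order=True to the decorator, and __ne__ is not generated because Python already derives it from __eq__: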

def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0) -> None:
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
def __repr__(self):
    return f'InventoryItem(name={self.name!r}, unit_price={self.unit_price!r}, quantity_on_hand={self.quantity_on_hand!r})'
def __eq__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) == (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __ne__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) != (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __lt__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) < (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __le__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) <= (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __gt__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) > (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __ge__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) >= (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented

Those thirty lines of automatically generated code were for only three attributes: name, unit_price, and quantity_on_hand. I believe this is significantly undersold because the effort to create and maintain those methods yourself becomes incredibly cumbersome as you add more class attributes: every new attribute has to be added to each method by hand. Who feels like dealing with that? Well, maybe Python contributor Eric V. Smith does, but partially thanks to him, you don’t have to!
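To see the generated methods at work, here is a small usage sketch built on the PEP’s InventoryItem class. The sample items and values are made up for illustration, and order=True is passed explicitly so that the ordering comparisons exist.

```python
from dataclasses import dataclass


@dataclass(order=True)  # order=True requests __lt__, __le__, __gt__, __ge__
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


widget = InventoryItem("widget", 2.5, 10)
same_widget = InventoryItem("widget", 2.5, 10)
gadget = InventoryItem("gadget", 9.99)  # quantity_on_hand defaults to 0

# Generated __repr__: every field is shown without any extra code.
print(widget)  # InventoryItem(name='widget', unit_price=2.5, quantity_on_hand=10)

# Generated __eq__: instances compare field by field.
print(widget == same_widget)  # True

# Generated __lt__: fields are compared in order, as a tuple would be.
print(gadget < widget)  # True, because 'gadget' sorts before 'widget'

print(widget.total_cost())  # 25.0
```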

Data Class Benefits

We can see a number of Data Class benefits based on the information we have thus far:

  1. Improved code legibility through more concise, declarative class format
  2. Fast, automatic generation of “dunder” (Double Underscore) methods
    1. Initialization: saves the values you give the class when you instantiate it
    2. Representation: what you use to print a string description of the class and its attributes
    3. Equal To Comparison
    4. Not Equal To Comparison
    5. Less Than Comparison (when order=True is requested)
    6. Less Than or Equal To Comparison (when order=True is requested)
    7. Greater Than Comparison (when order=True is requested)
    8. Greater Than or Equal To Comparison (when order=True is requested)
    9. Frozen (immutable), when frozen=True is requested: controls whether attributes can be set after creation. With a “frozen” class, attributes can only be given initial values. Trying to set or delete them after initialization raises an error.
  3. Quick, automated filtering of attributes included in string representation. This deserves its own bullet point from a maintainability perspective. As you declare attributes/fields for the class, you can quickly specify whether it should be included when you print class details. While this doesn’t offer any significant security benefit, it does help you reduce visual clutter without sacrificing your ability to save or process data.
  4. Type hinting and annotations help you and your IDE track what methods and properties apply to your variables even as you start to add layers of abstraction. This is especially helpful if you’re human like the rest of us and use code autocompletion.
  5. Standard, concise way of printing a class and its attributes using the print function:
    print(some_variable)
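Two of the benefits above, hiding a field from the string representation and freezing a class, can be sketched as follows. The ApiCredential class and its values are hypothetical; field(repr=False), frozen=True, and FrozenInstanceError come from the standard dataclasses module.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class ApiCredential:
    username: str
    # repr=False keeps this field out of the generated __repr__.
    token: str = field(repr=False)


cred = ApiCredential("monitor", "s3cr3t")

# The token is stored and usable, but it is left out when printing.
print(cred)        # ApiCredential(username='monitor')
print(cred.token)  # s3cr3t

# Frozen instances reject assignment after initialization.
blocked = False
try:
    cred.username = "someone_else"
except FrozenInstanceError as error:
    blocked = True
    print(f"assignment blocked: {error}")
```

As noted above, leaving a field out of the representation is about reducing clutter, not about security: the value is still a normal attribute.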

Data Class Caveats

Per PEP 557, Data Classes are not meant to be a full replacement for any other Python library. Furthermore, the PEP states that Data Classes are not appropriate where:

  • API compatibility with tuples or dicts is required.
  • Type validation beyond that provided by PEP 484 and 526 is required, or value validation or conversion is required.
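The sketch below illustrates both caveats under stated assumptions: the Reading class and its validation rule are hypothetical. A Data Class instance is not itself a tuple or dict, although the standard astuple and asdict helpers convert one explicitly, and any value validation or conversion has to be written by hand, for example in __post_init__.

```python
from dataclasses import asdict, astuple, dataclass


@dataclass
class Reading:
    sensor: str
    value: float

    def __post_init__(self):
        # Annotations are not enforced, so conversion and validation are manual.
        self.value = float(self.value)
        if self.value < 0:
            raise ValueError("value must be non-negative")


reading = Reading("temp", "21.5")  # the string is converted by __post_init__
print(reading)  # Reading(sensor='temp', value=21.5)

# The instance is neither a tuple nor a dict; conversion is an explicit step.
print(isinstance(reading, (tuple, dict)))  # False
print(astuple(reading))  # ('temp', 21.5)
print(asdict(reading))   # {'sensor': 'temp', 'value': 21.5}
```

If you need this kind of checking on many classes, that is a sign a dedicated validation library may be a better fit than hand-written __post_init__ methods.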

Conclusion

Python Data Classes combine existing Python features into a concise, declarative syntax in order to bring users an improved experience when managing classes used primarily for data storage. One of the key benefits of Python Data Classes is that you can quickly print string representations of classes and their attributes. Because of LogicMonitor’s powerful extensibility, you can leverage Python Data Classes among all the other newest Python features in your monitoring and portal administration.