Published Jan 18, 2016
A class is an abstraction for a user-defined datatype with its own internal values (fields) and behaviors (methods). Objects are instances of classes. All objects which are instances of the same class will have the same fields and methods (Python uses the term attribute to refer to fields and methods). In Python, individual objects can have additional attributes added to their internal namespaces, but all instances will have at least the attributes defined by the class.
Classes are created using the keyword class followed by the class’s name and body.
The class body is executed in its own namespace.
Any names defined in this namespace become attributes of the class.
Instances are created by calling the class as though it were a function.
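A minimal sketch of these rules. The class name Point and the values used are illustrative assumptions.

```python
# Point is an illustrative name; x and y are bound in the class body,
# so they become attributes of the class itself.
class Point:
    x = 0
    y = 0

# Calling the class like a function creates an instance.
p = Point()

# Binding a name on the instance adds it to the instance's own namespace.
p.x = 5

print(p.x)      # 5, found in the instance namespace
print(p.y)      # 0, found in the class namespace
print(Point.x)  # 0, the class attribute is unchanged
```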
When attempting to access a particular attribute of an object, the object’s namespace is searched for that attribute. If the attribute is not found, the namespaces of the object’s class and all superclasses are searched. The exact algorithm is described by the Method Resolution Order. A metaclass can define an mro method to override the default order.
Methods are implemented by functions in a class body.
The instance is passed as the first argument so the function can access the
instance’s namespace.
This can be done explicitly (by accessing the function through the class)
or implicitly (by accessing the function through the instance).
By convention, the first parameter of a method is called self.
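As a small sketch, the two ways of passing the instance look like this. Greeter and greet are hypothetical names.

```python
# Greeter is a hypothetical class used only for illustration.
class Greeter:
    def greet(self, name):
        # self is the instance the method was called on.
        return f"Hello, {name}, from {type(self).__name__}"

g = Greeter()

# Implicit: accessing the function through the instance passes g as self.
implicit = g.greet("Ada")

# Explicit: accessing the function through the class requires passing g.
explicit = Greeter.greet(g, "Ada")

print(implicit == explicit)  # True
```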
The special method __init__ is used to initialize the new instance. The arguments passed when calling the class are forwarded to __init__.
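A minimal sketch of an initializer. The default values are an assumption made for illustration.

```python
# Illustrative Point class whose attributes are set per instance.
class Point:
    def __init__(self, x=0, y=0):
        # Bind the arguments to the new instance's namespace.
        self.x = x
        self.y = y

# The arguments given to the class are passed on to __init__.
p = Point(3, 4)

print(p.x, p.y)            # 3 4
print("x" in vars(Point))  # False, the class itself has no x attribute
```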
You may notice that I used two different techniques to initialize x and y. The first example assigned x and y in the class body. In the second example, x and y are passed to __init__ and bound to self.
There is a subtle difference between the two: the first technique creates
class attributes while the second creates instance attributes.
Class attributes are attached to the class and therefore shared by all instances.
Instance attributes are unique to each instance.
In the first example, p.x resolved to the instance attribute while p.y resolved to the class attribute.
In the second example, p.x and p.y both resolved to instance attributes; there are no x or y class attributes in Point.
There are advantages and disadvantages to both.
If you expect most instances to use the default values, then class attributes
may be more memory-efficient.
However, if you want values to be supplied to the __init__ method, then it doesn’t make sense to duplicate the assignment code.
Also, since class attributes are shared, a mutable class attribute (e.g. a list or dict) that is modified through any instance is modified for all instances. This may or may not be desired.
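The sharing behavior can be sketched as follows. Team, members, and notes are hypothetical names.

```python
# Team is a hypothetical class used only for illustration.
class Team:
    members = []  # class attribute: one list shared by every instance

    def __init__(self):
        self.notes = []  # instance attribute: a fresh list per instance

a = Team()
b = Team()

a.members.append("ann")    # mutates the single shared list
a.notes.append("private")  # mutates only a's own list

print(b.members)  # ['ann'], b sees the change made through a
print(b.notes)    # [], b's own list is untouched
```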
In addition to the initializer __init__, there is also the constructor __new__.
__new__ must create and return the new instance (it is an error for __init__ to return anything other than None).
If __new__ returns an instance of the class, that instance’s __init__ method is then called.
Normally, __new__ is only defined when extending immutable types, because __init__ must modify the instance, which is only possible for mutable types.
In most cases, just defining __init__ is sufficient.
See the documentation for __new__ and __init__.
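A short sketch of extending an immutable type with __new__. The Celsius class and its rounding rule are assumptions made for illustration.

```python
# Celsius is a hypothetical float subclass. float is immutable, so the
# value has to be fixed in __new__; __init__ would run too late to change it.
class Celsius(float):
    def __new__(cls, value):
        # Create and return the instance, rounded to one decimal place.
        return super().__new__(cls, round(value, 1))

t = Celsius(21.456)

print(t)                       # 21.5
print(isinstance(t, Celsius))  # True
```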
If a class defines an attribute __slots__, it must be a sequence of names (str) which will be defined in an instance’s namespace.
Instances of a class with __slots__ will reserve only enough memory for the names defined in __slots__.
This can reduce the memory usage for a class which has a lot of instances.
It can also prevent additional names from being added to an instance’s namespace.
To allow an instance to have additional names bound to its namespace, include "__dict__" in __slots__.
Class attributes cannot be used to initialize the names defined in __slots__.
When extending a class that defines __slots__, the subclass’s __slots__ must declare only the additional slots.
Using the same name in the subclass’s __slots__ and the base class’s __slots__ can break the program.
See __slots__.
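A minimal sketch of these rules. Vec and Vec3 are hypothetical classes.

```python
# Vec and Vec3 are hypothetical classes used only for illustration.
class Vec:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

v = Vec(1, 2)

# Names not listed in __slots__ cannot be bound on the instance.
try:
    v.z = 3
    added = True
except AttributeError:
    added = False

# A subclass declares only the slots it adds to the base class.
class Vec3(Vec):
    __slots__ = ("z",)

    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

w = Vec3(1, 2, 3)

print(added)                  # False
print(hasattr(v, "__dict__"))  # False, no per-instance dict is allocated
print(w.x, w.y, w.z)          # 1 2 3
```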
Most object-oriented languages support a notion of public and private members.
Public members are available to all code that uses the instance, while private members are visible only to the instance’s methods.
Python does not support this distinction.
By convention, any attribute whose name starts with _ is considered private, but the interpreter does not enforce this.
When a name inside a class definition starts with at least two underscores and does not end with at least two underscores (i.e. it ends with no more than one underscore), then that name undergoes name mangling. The class name with one leading underscore is prepended to the name. This is to prevent “private” names in a subclass from overriding “private” names in a superclass, which may break the superclass.
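Name mangling can be sketched like this. Base, Child, and the __token attribute are hypothetical names.

```python
# Base and Child are hypothetical classes used only for illustration.
class Base:
    def __init__(self):
        self.__token = "base"   # stored on the instance as _Base__token

    def base_token(self):
        return self.__token     # compiled as self._Base__token

class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"  # stored as _Child__token, no clash with Base

c = Child()

print(c.base_token())   # base
print(sorted(vars(c)))  # ['_Base__token', '_Child__token']
```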
Subclasses are created by placing the names of superclasses in parentheses after
the class name.
Subclasses can override attributes defined in base classes.
An overriding method can call the base class implementation using super().
See the guide to using super().
Note that you must explicitly call super().__init__() in your __init__ method; this is not done automatically.
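A small sketch of overriding with super(). Animal and Dog are hypothetical classes.

```python
# Animal and Dog are hypothetical classes used only for illustration.
class Animal:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal"

class Dog(Animal):
    def __init__(self, name, breed):
        # The base initializer is not called automatically.
        super().__init__(name)
        self.breed = breed

    def describe(self):
        # Reuse the base class implementation and extend its result.
        return super().describe() + f" ({self.breed})"

d = Dog("Rex", "collie")
print(d.describe())  # Rex is an animal (collie)
```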
Python supports multiple inheritance, where a class can have multiple base classes. This is commonly used to support mix-in classes.
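A minimal sketch of a mix-in and the resulting method resolution order. JsonMixin, Point, and JsonPoint are hypothetical names.

```python
import json

# JsonMixin is a hypothetical mix-in: it adds behavior but holds no state.
class JsonMixin:
    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# The mix-in is listed first so its methods are found before Point's.
class JsonPoint(JsonMixin, Point):
    pass

jp = JsonPoint(1, 2)
mro_names = [c.__name__ for c in JsonPoint.__mro__]

print(jp.to_json())  # {"x": 1, "y": 2}
print(mro_names)     # ['JsonPoint', 'JsonMixin', 'Point', 'object']
```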
Most operators in Python are implemented by methods with special names.
The most obvious when running Python in interactive mode is converting an object to its string representation. This is done using __repr__.
(There is also a __str__ method. Its default is to call __repr__.
The difference is that the result of __repr__ should look like a call to the class’s constructor, while the result of __str__ should just look informative.
See __repr__ and __str__.)
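A short sketch of the two string conversions. Point and Bare are hypothetical classes.

```python
# Point and Bare are hypothetical classes used only for illustration.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Looks like a call to the constructor.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Just informative.
        return f"({self.x}, {self.y})"

class Bare:
    def __repr__(self):
        return "Bare()"

p = Point(1, 2)

print(repr(p))      # Point(1, 2)
print(str(p))       # (1, 2)
print(str(Bare()))  # Bare(), the default __str__ falls back to __repr__
```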
The following methods implement the comparison operators:
__lt__: <
__le__: <=
__eq__: ==
__ne__: !=
__gt__: >
__ge__: >=
These methods should return True, False or NotImplemented.
The last of these indicates that the class does not know how to compare the arguments.
If one of these methods returns NotImplemented, then Python will swap the operands and call the complementary method on the other object (e.g. x == y tries x.__eq__(y) and then y.__eq__(x), while x < y falls back to y.__gt__(x)).
For immutable types, if you implement __eq__ you should also implement __hash__, because a class that defines __eq__ without __hash__ becomes unhashable.
See __hash__ and my section on implementing hashable types.
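A minimal sketch of comparison methods on an immutable-style type. Version and its _key helper are hypothetical names.

```python
# Version is a hypothetical class used only for illustration.
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented  # let Python try the other operand
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Equal objects must have equal hashes.
        return hash(self._key())

print(Version(1, 2) == Version(1, 2))  # True
print(Version(1, 2) < Version(1, 3))   # True
# Version defines no __gt__, so Python swaps the operands and uses __lt__.
print(Version(1, 3) > Version(1, 2))   # True
# Both sides return NotImplemented, so == falls back to identity.
print(Version(1, 2) == "1.2")          # False
```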
To implement the arithmetic operators, see the section emulating numeric types.
When called, the operator methods must return either the result of the operation or NotImplemented.
If the method returns NotImplemented, then the other operand’s reflected operator method is called.
For example, to implement x + y, either x’s class must define __add__ to accept instances of y’s class, or y’s class must define __radd__ to accept instances of x’s class.
There are also in-place versions of the operators, used by augmented assignments such as +=.
These methods should attempt to perform the modification in-place (i.e. modify self) and return the result.
The result is bound to the name to the left of the operator (e.g. x += y becomes x = x.__iadd__(y)).
There are no reflected versions of these methods (it wouldn’t make sense).
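The three kinds of operator method can be sketched together. Tally is a hypothetical class, and accepting int operands is an assumption made for illustration.

```python
# Tally is a hypothetical mutable counter used only for illustration.
class Tally:
    def __init__(self, count=0):
        self.count = count

    def __add__(self, other):
        if isinstance(other, Tally):
            return Tally(self.count + other.count)
        if isinstance(other, int):
            return Tally(self.count + other)
        return NotImplemented

    def __radd__(self, other):
        # Called for `other + self` after other's __add__ gave up.
        return self.__add__(other)

    def __iadd__(self, other):
        result = self.__add__(other)
        if result is NotImplemented:
            return NotImplemented
        self.count = result.count  # modify self in place
        return self                # the name on the left is rebound to this

left = Tally(1) + 2   # uses Tally.__add__
right = 2 + Tally(1)  # int.__add__ gives up, so Tally.__radd__ runs

t = Tally(1)
alias = t
t += 4                # uses Tally.__iadd__; t is still the same object

print(left.count, right.count)  # 3 3
print(t is alias, t.count)      # True 5
```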
To implement a container, see emulating container types.
The len function is implemented by the __len__ method.
The index operator is implemented by the methods __getitem__, __setitem__ and __delitem__.
for loops expect __getitem__ to raise an IndexError to detect the end of the loop, unless you define your own iterator using __iter__.
__iter__ and __reversed__ are used to implement iterators.
They must create and return an iterator object.
A common shortcut is to implement them as generators.
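A minimal sketch of a container. Playlist and Squares are hypothetical classes, and the second shows the IndexError fallback used when __iter__ is not defined.

```python
# Playlist is a hypothetical container that wraps a list.
class Playlist:
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

    def __getitem__(self, index):
        return self._tracks[index]

    def __setitem__(self, index, value):
        self._tracks[index] = value

    def __delitem__(self, index):
        del self._tracks[index]

    def __iter__(self):
        # Written as a generator, so calling it returns an iterator.
        for track in self._tracks:
            yield track

    def __reversed__(self):
        for track in self._tracks[::-1]:
            yield track

# Squares defines only __getitem__; iteration stops at IndexError.
class Squares:
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index

pl = Playlist(["a", "b", "c"])
length = len(pl)
pl[1] = "B"
del pl[0]
forward = list(pl)
backward = list(reversed(pl))
squares = list(Squares())

print(length, forward, backward)  # 3 ['B', 'c'] ['c', 'B']
print(squares)                    # [0, 1, 4]
```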
The call operator is defined by __call__. Arguments to the call are arguments to __call__.
See the documentation.
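A short sketch of a callable instance. Multiplier is a hypothetical class.

```python
# Multiplier is a hypothetical class used only for illustration.
class Multiplier:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Arguments given to the call arrive here.
        return value * self.factor

double = Multiplier(2)

print(double(21))        # 42
print(callable(double))  # True
```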
A decorator is a special type of callable which accepts a callable as a parameter and returns a callable. Python has special syntax for decorators so that they can be applied when the callable is defined.
Decorators can be used to register a function to be called at some other time
by other code.
For example, the atexit module contains a register function, usable as a decorator, which registers the function to be executed when the interpreter shuts down.
A decorator can be any callable object.
Since classes are callable (calling them creates an instance), a class can also be used as a decorator; if the class defines __call__, the resulting instance can still be called like the original function.
Classes themselves can also be decorated.
There is a list of decorator recipes in the Python Decorator Library.
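A minimal sketch of the three cases above: a registering function decorator, a class used as a decorator, and a decorated class. HANDLERS, register, CountCalls, and add_tag are all hypothetical names.

```python
import functools

# Hypothetical registry filled in by the register decorator.
HANDLERS = {}

def register(func):
    # Record the function and return it unchanged.
    HANDLERS[func.__name__] = func
    return func

class CountCalls:
    # Decorating a function creates an instance that wraps it.
    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)

def add_tag(cls):
    # A class decorator receives the class and returns a class.
    cls.tag = cls.__name__.lower()
    return cls

@register
def on_start():
    return "started"

@CountCalls
def add(a, b):
    return a + b

@add_tag
class Widget:
    pass

first = add(1, 2)
second = add(2, 3)

print(HANDLERS["on_start"]())    # started
print(first, second, add.calls)  # 3 5 2
print(Widget.tag)                # widget
```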
Static methods do not receive an instance as a parameter.
A class with static methods serves as a namespace for the methods.
Static methods can also serve as alternative constructors with more descriptive
names.
A static method is created with the @staticmethod decorator.
A class method receives a class as its first parameter, instead of an instance.
These can be more flexible than static methods, especially with inheritance.
Class methods are created with the @classmethod decorator.
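A small sketch contrasting the two. Temperature, Lab, and their method names are hypothetical.

```python
# Temperature and Lab are hypothetical classes used only for illustration.
class Temperature:
    def __init__(self, kelvin):
        self.kelvin = kelvin

    @staticmethod
    def celsius_to_kelvin(celsius):
        # No instance or class is passed in.
        return celsius + 273.15

    @classmethod
    def from_celsius(cls, celsius):
        # cls is whichever class the method was called on.
        return cls(cls.celsius_to_kelvin(celsius))

class Lab(Temperature):
    pass

reading = Lab.from_celsius(0)

print(type(reading).__name__)  # Lab, the class method respects inheritance
print(reading.kelvin)          # 273.15
```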
The following creates a base class and registry for website parsers.
Subclasses must either define class attributes NETLOC_PATTERN and PATH_PATTERN or override the class method can_handle_url.
Users can get a parser for a certain website by passing the URL to ParserFactory.get_parser(url).
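One way such a registry could look is sketched below. Only NETLOC_PATTERN, PATH_PATTERN, can_handle_url, and ParserFactory.get_parser come from the description above; the base class name Parser, the use of __init_subclass__ for registration, the regular-expression matching, and both example subclasses are assumptions.

```python
import re
from urllib.parse import urlparse

class Parser:
    # Subclasses set these to regular expressions (assumed format).
    NETLOC_PATTERN = None
    PATH_PATTERN = None

    # Assumed registry: every subclass is recorded when it is defined.
    registry = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Parser.registry.append(cls)

    def __init__(self, url):
        self.url = url

    @classmethod
    def can_handle_url(cls, url):
        if cls.NETLOC_PATTERN is None or cls.PATH_PATTERN is None:
            return False
        parts = urlparse(url)
        return (re.fullmatch(cls.NETLOC_PATTERN, parts.netloc) is not None
                and re.match(cls.PATH_PATTERN, parts.path) is not None)

class ParserFactory:
    @staticmethod
    def get_parser(url):
        # Return an instance of the first registered class that accepts url.
        for parser_cls in Parser.registry:
            if parser_cls.can_handle_url(url):
                return parser_cls(url)
        raise LookupError(f"no parser registered for {url!r}")

# Hypothetical subclass that relies on the two pattern attributes.
class ExampleBlogParser(Parser):
    NETLOC_PATTERN = r"(www\.)?example\.com"
    PATH_PATTERN = r"/blog/"

# Hypothetical subclass that overrides the class method instead.
class LocalhostParser(Parser):
    @classmethod
    def can_handle_url(cls, url):
        return urlparse(url).hostname == "localhost"

blog = ParserFactory.get_parser("https://example.com/blog/post-1")
local = ParserFactory.get_parser("http://localhost:8000/index.html")

print(type(blog).__name__)   # ExampleBlogParser
print(type(local).__name__)  # LocalhostParser
```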
Properties are attributes which are accessed like fields but behave like methods. Properties are created with the property decorator.
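A minimal sketch of a read-write and a read-only property. Circle and its validation rule are hypothetical.

```python
import math

# Circle is a hypothetical class used only for illustration.
class Circle:
    def __init__(self, radius):
        self.radius = radius  # goes through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value

    @property
    def area(self):
        # Read-only: computed on each access, no setter defined.
        return math.pi * self._radius ** 2

c = Circle(2)

print(c.radius)  # 2
print(c.area == math.pi * 4)  # True
```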
Descriptors are the magic behind Python’s methods and properties.
A descriptor is any object whose class defines any of the methods __get__, __set__, and __delete__.
When a class attribute defines the __get__ method, it is called and the result is returned as the result of the attribute lookup.
Functions are descriptors.
When a function is an attribute of a class, its __get__ method either returns itself (when accessed via the class) or a bound method object (when accessed via an instance).
The bound method prepends the instance to the argument list, thus the instance becomes the function’s self parameter.
More information about this process is described in From Function to Method. A full discussion can be found in the Descriptor HowTo Guide.
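A short sketch of a data descriptor, followed by the function-to-method binding done by hand. Positive and Order are hypothetical names.

```python
# Positive is a hypothetical descriptor that validates assigned values.
class Positive:
    def __set_name__(self, owner, name):
        # Remember where to store the value on each instance.
        self.storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed via the class
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage, value)

class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity  # calls Positive.__set__

    def total(self, price):
        return self.quantity * price

order = Order(3)

# Accessed via the class, the function is returned as-is.
plain = Order.__dict__["total"]
same_function = plain is Order.total

# Calling __get__ by hand produces the bound method.
bound = plain.__get__(order, Order)

print(order.quantity)           # 3
print(same_function)            # True
print(bound.__self__ is order)  # True
print(bound(2))                 # 6
```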
Just like objects are instances of classes, classes are instances of metaclasses. When a class specifies a metaclass (or inherits from a class which specifies a metaclass), the following occurs:
First, the metaclass’s __prepare__ class method is called with arguments name, bases, and **kwds. name is the name of the new class, bases is a tuple of the base classes, and **kwds is any keyword arguments specified after the base classes. It must return a dict representing the class’s namespace.
Next, the class body is executed in that namespace.
Finally, the metaclass is called with arguments name, bases, namespace, and **kwds, which invokes its __new__ and __init__ methods. name is the name of the new class, bases is a tuple of the base classes, namespace is a dict containing the updated namespace, and **kwds is any keyword arguments specified after the base classes.
Note that most examples of metaclasses use __new__ instead of __init__.
This is not required.
In fact, it can be more useful to use __init__.
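A minimal sketch of this sequence using __init__ rather than __new__. PluginMeta, Plugin, and the created_by and plugins attributes are hypothetical, and no extra keyword arguments are passed in the class statements.

```python
# PluginMeta is a hypothetical metaclass used only for illustration.
class PluginMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwds):
        # Returns the namespace in which the class body will execute.
        return {"created_by": mcs.__name__}

    def __init__(cls, name, bases, namespace, **kwds):
        # The class object already exists here; cls is the new class.
        super().__init__(name, bases, namespace)
        if not hasattr(cls, "plugins"):
            cls.plugins = []         # first class in the hierarchy
        else:
            cls.plugins.append(cls)  # every later subclass registers itself

class Plugin(metaclass=PluginMeta):
    pass

# Subclasses inherit the metaclass from Plugin.
class Audio(Plugin):
    pass

class Video(Plugin):
    pass

print(type(Audio).__name__)                   # PluginMeta
print(Audio.created_by)                       # PluginMeta
print([c.__name__ for c in Plugin.plugins])   # ['Audio', 'Video']
```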
Since classes are instances of metaclasses, any methods defined in the metaclass are available on the class (but not on the class’s instances).
In particular, the metaclass can implement operator methods, allowing operators to be used on class objects.
For example, if a metaclass defines __call__, then attempting to instantiate a class which uses this metaclass will actually call the __call__ method.
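As a sketch, a metaclass __call__ can intercept instantiation. Singleton and Config are hypothetical names, and caching one instance per class is just one possible use.

```python
# Singleton is a hypothetical metaclass used only for illustration.
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Runs whenever a class using this metaclass is "instantiated".
        if cls not in Singleton._instances:
            # type.__call__ performs the usual __new__ and __init__ steps.
            Singleton._instances[cls] = super().__call__(*args, **kwargs)
        return Singleton._instances[cls]

class Config(metaclass=Singleton):
    def __init__(self):
        self.values = {}

first = Config()
second = Config()

print(first is second)  # True, the second call returned the cached instance
```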
The enum module in the standard library uses this in the EnumMeta metaclass.
The EnumMeta metaclass overrides several operator methods, such as __call__, __iter__, and __getitem__.
Enum is an instance of EnumMeta.
All subclasses of Enum will inherit these operators, allowing for the syntax shown in the examples.
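A brief sketch of that syntax. Color is a hypothetical enumeration.

```python
from enum import Enum

# Color is a hypothetical enumeration used only for illustration.
class Color(Enum):
    RED = 1
    GREEN = 2

by_value = Color(1)       # the metaclass's __call__ looks up a member by value
by_name = Color["GREEN"]  # the metaclass's __getitem__ looks up by name
members = list(Color)     # the metaclass's __iter__ yields the members

print(by_value)  # Color.RED
print(by_name)   # Color.GREEN
print(members == [Color.RED, Color.GREEN])  # True
```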