Objects and classes in Python¶
Contents:
Decorators¶
This section cover the decorator syntax and the concept of a decorator (or decorating) callable.
Decorators are a syntactic convenience, that allows a Python source file to say what it is going to do with the result of a function or a class statement before rather than after the statement. Decorators on function statements have been available since Python 2.4, and on class statements since Python 2.6.
In this section we describe the decorator syntax and give examples of its use. In addition, we will discuss functions (and other callables) that are specifically designed for use as decorators. They are also called decorators.
You can, and in medium sized or larger projects probably should, write your own decorators. The decorator code might, unfortunately, be a little complex. But it can greatly simplify the other code.
The decorator syntax¶
The decorator syntax uses the @ character. For function statements the following are equivalent:
# State, before defining f, that a_decorator will be applied to it.
@a_decorator
def f(...):
...
def f(...):
...
# After defining f, apply a_decorator to it.
f = a_decorator(f)
The benefits of using the decorator syntax are:
- The name of the function appears only once in the source file.
- The reader knows, before the possibly quite long definition of the function, that the decorator function will be applied to it.
The decorator syntax for a class statement is same, except of course that it applies to a class statement.
Bound methods¶
Unless you tell it not to, Python will create what is called a bound method when a function is an attribute of a class and you access it via an instance of a class. This may sound complicated but it does exactly what you want.
>>> class A(object):
... def method(*argv):
... return argv
>>> a = A()
>>> a.method
<bound method A.method of <A object at 0x...>>
When we call the bound method the object a is passed as an argument.
>>> a.method('an arg')
(<A object at 0x...>, 'an arg')
>>> a.method('an arg')[0] is a
True
staticmethod()
¶
A static method is a way of suppressing the creation of a bound method when accessing a function.
>>> class A(object):
... @staticmethod
... def method(*argv):
... return argv
>>> a = A()
>>> a.method
<function method at 0x...>
When we call a static method we don’t get any additional arguments.
>>> a.method('an arg')
('an arg',)
classmethod()
¶
A class method is like a bound method except that the class of the instance is passed as an argument rather than the instance itself.
>>> class A(object):
... @classmethod
... def method(*argv):
... return argv
>>> a = A()
>>> a.method
<bound method type.method of <class 'A'>>
When we call a class method the class of the instance is passed as an additional argument.
>>> a.method('an arg')
(<class 'A'>, 'an arg')
>>> a.method('an arg')[0] is A
True
In addition, class methods can be called on the class itself.
>>> A.method('an arg')
(<class 'A'>, 'an arg')
The call()
decorator¶
Suppose we want to construct a lookup table, say containing the squares of positive integers for 0 to n.
For n small we can do it by hand:
>>> table = [0, 1, 4, 9, 16]
>>> len(table), table[3]
(5, 9)
Because the formula is simple, we could also use a list comprehension:
>>> table = [i * i for i in range(5)]
>>> len(table), table[3]
(5, 9)
Here’s another way, that uses a helper function (which we will call table). For a table of squares list comprehension is better, because we can write an expression that squares. But for some tables a complex sequence of statements is required.
>>> def table(n):
... value = []
... for i in range(n):
... value.append(i*i)
... return value
>>> table = table(5)
We call the helper function table for three related reasons
- It indicates the purpose of the function.
- It ensures that the helper function is removed from the namespace once the table has been constructed.
- It conforms to the decorator syntax.
As before, we test the table and find that it works. >>> len(table), table[3] (5, 9)
>>> def call(*argv, **kwargs):
... def call_fn(fn):
... return fn(*argv, **kwargs)
... return call_fn
>>> @call(5)
... def table(n):
... value = []
... for i in range(n):
... value.append(i*i)
... return value
>>> len(table), table[3]
(5, 9)
Nesting decorators¶
The decorator syntax can be nested. The following example is similar to the list comprehension approach, except that it uses a generator function rather than a generator expression.
>>> @list
... @call(5)
... def table(n):
... for i in range(n):
... yield i * i
We read this as saying:
The value of table is the list obtained by iterating over the function evaluated at n equal to 5.
The purpose of this example is illustrate some of the concepts. We are not saying that it is, or is not good programming practice. That will depend, in part, on the context.
As before, we test the table and find that it works.
>>> len(table), table[3]
(5, 9)
Class decorators before Python 2.6¶
Prior to Python 2.6 one could not write
@a_decorator
class MyClass(...):
# possibly many lines of code.
If you need to support earlier versions of Python, I recommend that you develop in Python 2.6 or later. This allows your mind and keyboarding to use decorators. Once the decorating code is stable refactor it to support earlier versions of Python, as follows.
# @a_decorator
class MyClass(...):
# possibly many lines of code.
MyClass = a_decorator(MyClass) # if changed, change decorator comment.
This approach allows you to think and largely code using the class decorator point of view, at the cost of having to keep the decorator comment up to date when the decorator changes.
Constructing classes¶
There are two basic ways of constructing classes in Python. The best
known way is to use Python’s class
statement. The other
way is to use Python’s type()
function. This page covers the
statement way. type(name, bases, dict) is more powerful and
sometimes more convenient. However, the statement approach is the
better way to get started and in ordinary programming is the most
convenient.
The empty class¶
We will start with the empty class, which is not as empty as it looks.
>>> class A(object):
... pass
Like most Python objects, our empty class has a dictionary. The dictionary holds the attributes of the object.
>>> A.__dict__
<dictproxy object at 0x...>
Even though our class is empty, its dictionary (or more exactly dictproxy) is not.
>>> sorted(A.__dict__.keys())
['__dict__', '__doc__', '__module__', '__weakref__']
Attributes __doc__ and __module__ are there for documentation, and to give better error messages in tracebacks. The other attributes are there for system purposes.
In addition, our class two attributes that are not even listed in the dictionary. The __bases__ attribute is the list of base classes provided in the original class statement.
>>> A.__bases__
(<type 'object'>,)
The method resolution order (mro) attribute __mro__ is computed from the bases of the class. It provides support for multiple inheritance.
>>> A.__mro__
(<class 'A'>, <type 'object'>)
For now the important thing is that even the empty class has attributes. (For IronPython and Jython the attributes are slightly different.)
dict_from_class()
¶
In this section we define a functions that gets a dictionary from a class. This dictionary contains all the information supplied in the body of a class statement, except for the doc-string.
The __dict__ of the empty class¶
Here’s our empty class again:
>>> class A(object):
... pass
As seen in Constructing classes, even for the empty class its class dictionary has entries. Handling these always-there entries is a nuisance when deconstructing classes. Here, once again, is the list of entries.
>>> sorted(A.__dict__.keys())
['__dict__', '__doc__', '__module__', '__weakref__']
The __dict__ and __weakref__ entries are there purely for system purposes. This makes them easier to deal with.
The class docstring __doc__ is None unless the user supplies a value.
>>> A.__doc__ is None
True
>>> class A2(object):
... 'This is the docstring'
>>> A2.__doc__
'This is the docstring'
Ordinarily, __module__ is the name of the module in which the class is defined. However, because of the way Sphinx uses doctest, it gets the name of the module wrong. Please don’t worry about this. Despite what it says below, it’s the name of the module.
>>> A.__module__
'__builtin__'
Is the doc-string part of the body?¶
Soon we will define a function that copies the body, as a dictionary, out of a class. But first we must answer the question: Is the doc-string part of the body of a class?
There is no completely satisfactory answer to this question, as there are good arguments on both sides. We choose NO, because for example using the -OO command line option will remove doc-strings, and so they are not an essential part of the body of the class. (However, -OO does not remove doc-strings produced explicitly, by assigning to __doc__.)
The keys to be excluded are precisely the ones that the empty class (which has an empty body) has.
>>> _excluded_keys = set(A.__dict__.keys())
Definition of dict_from_class()
¶
This function simply filters the class dictionary, copying only the items whose key is not excluded.
>>> def dict_from_class(cls):
... return dict(
... (key, value)
... for (key, value) in cls.__dict__.items()
... if key not in _excluded_keys
... )
As expected, the empty class has an empty body.
>>> dict_from_class(A)
{}
Here’s a class whose body is not empty.
>>> class B(object):
... 'This docstring is not part of the body.'''
... s = 'a string'
... def f(self): pass
We get what we expect for the body. (See [somewhere] for why we need the __func__.)
>>> dict_from_class(B) == dict(s='a string', f=B.f.__func__)
True
Here’s another way of expressing the same truth.
>>> sorted(dict_from_class(B).items())
[('f', <function f at 0x...>), ('s', 'a string')]
property_from_class()
¶
This section shows how using a class decorator, based upon dict_from_class(), can make it much easier to define complex properties. But first we review properties.
About properties¶
The property()
type is a way of ‘owning the dot’ so that
attribute getting, setting and deletion calls specified functions.
One adds a property to a class by adding to its a body a line such as the following, but with suitable functions for some or all of fget, fset and fdel. One can also specify doc to give the property a doc-string.
attrib = property(fget=None, fset=None, fdel=None, doc=None)
If all one wants is to specify fset (which is a common case) you can use property as a decorator. This works because fget is the first argument.
For example, to make the area of a rectangle a read-only property you could write:
@property
def attrib(self):
return self.width * self.length
Suppose now you have a property that you wish to both get and set. Here’s the syntax we’d like to use.
@property_from_class
class attrib(object):
'''Doc-string for property.'''
def fget(self):
'''Code to get attribute goes here.'''
def fset(self):
'''Code to set attribute goes here.'''
We will now construct such a decorator.
Definition of property_from_class()
¶
This function, designed to be used as a decorator, is applied to a
class and returns a property. Notice how we pick up the doc-string as
a separate parameter. We don’t have to check for unwanted keys in the
class dictionary - property()
will do that for us.
>>> def property_from_class(cls):
...
... return property(doc=cls.__doc__, **dict_from_class(cls))
Using property_from_class()
¶
Here is an example of its use. We add a property called value, which stores its data in _value (which by Python convention is private). In this example, we validate the data before it is stored (to ensure that it is an integer).
>>> class B(object):
... def __init__(self):
... self._value = 0
...
... @property_from_class
... class value(object):
... '''The value must be an integer.'''
... def fget(self):
... return self._value
... def fset(self, value):
... # Ensure that value to be stored is an int.
... assert isinstance(value, int), repr(value)
... self._value = value
Here we show that B
has the required properties.
>>> b = B()
>>> b.value
0
>>> b.value = 3
>>> b.value
3
>>> B.value.__doc__
'The value must be an integer.'
>>> b.value = 'a string'
Traceback (most recent call last):
AssertionError: 'a string'
Unwanted keys¶
If the class body contains a key that property does not accept we for no extra work get an exception (which admittedly could be a clearer).
>>> @property_from_class
... class value(object):
... def get(self):
... return self._value
Traceback (most recent call last):
TypeError: 'get' is an invalid keyword argument for this function
Deconstructing classes¶
In Constructing classes we saw how to construct a class (by using the class keyword). In this section we see how to reverse the process.
To use the class keyword you have to specify:
- A name for your class.
- A tuple of bases.
- A class body.
In this section we see how to get this information back again. Let’s do the easy stuff first. Here’s our empty class again:
>>> class A(object):
... pass
Here’s how to get the name of the class:
>>> A.__name__
'A'
And here’s how to get the bases:
>>> A.__bases__
(<type 'object'>,)
[To be continued.]
type(name, bases, dict)¶
According to its docstring, there are two ways to call the
type()
builtin.
>>> print type.__doc__
type(object) -> the object's type
type(name, bases, dict) -> a new type
In this section we explore how to use type()
to construct new
classes.
Constructing the empty class¶
As usual, we start with the empty class. The __name__ attribute of the class need not be the same as the name of the variable in which we store the class. When at top-level (in the module context) the class command binds the class to the module object, using the name of the class as the key.
When we use type, there is no link between the __name__ and the binding.
>>> cls = type('A', (object,), {})
The new class has the name we expect.
>>> cls.__name__
'A'
Its docstring is empty.
>>> cls.__doc__ is None
True
It does not have a __module__ attribute, which is surprising.
>>> cls.__module__
Traceback (most recent call last):
AttributeError: __module__
This class does not have a __module__ attribute because to the things that Sphinx does when running the doctest. Ordinarily, the class will have a __module__ attribute.
>>> sorted(cls.__dict__.keys())
['__dict__', '__doc__', '__weakref__']
The lack of a __module__ attribute explains the string representation of the class.
>>> cls
<class 'A'>
Constructing any class¶
We obtained the empty class, whose __dict__ has only the system keys,
by passing the empty dictionary to type()
. We obtain more
interesting classes by passing a non-empty dictionary. We can at the
same time pass more interesting bases, in order to achieve
inheritance.
Specifying __doc__, __name__ and __module__¶
Let’s try to use the dict argument to specify these special attributes.
>>> body = dict(__doc__='docstring', __name__='not_A', __module__='modname')
>>> cls2 = type('A', (object,), body)
We have set the __docstring__ and __module__ attributes, but the __name__ is still A.
>>> cls2.__doc__, cls2.__name__, cls2.__module__
('docstring', 'A', 'modname')
Subclassing int
¶
We subclass in order to create a new class whose behaviour is inherited from the base classes, except that the new class can also override and add behaviour. Object creation is behaviour. For most classes it is enough to provide a different __init__ method, but for immutable classes one often have to provide a different __new__ method.
In this section we explain why __new__ is needed, and give examples of its use. But first we review mutation.
Mutable and immutable types¶
Some objects in Python, such as dictionaries and lists, can be
changed. We can change these objects after they have been made. This
is called mutation. The types dict
and list
are
called mutable types.
>>> x = []
>>> x.append(1)
>>> x
[1]
Some other objects, such as strings and tuples, cannot be changed. Once made they cannot be changed. They are called immutable types.
>>> y = 'abc'
>>> y[0] = 'A'
Traceback (most recent call last):
TypeError: 'str' object does not support item assignment
Enumerated integers and named tuples¶
We will use enumerated integers as an example in this section. In Python, booleans are an example of an enumerated integer type.
However, our task in this section is not to use booleans but to understand them. This will allow us to create our own subclasses of int and of immutable types.
The bool
type¶
Here we review the bool
type in Python.
Comparisons return a boolean, which is either True or False.
>>> 1 < 2, 1 == 2
(True, False)
True and False are instance of the bool
type.
>>> type(True), type(False)
(<type ‘bool’>, <type ‘bool’>)
The bool
type inherits from int
.
>>> bool.__bases__
(<type 'int'>,)
Because True and False are (in the sense of inherit from) integers, we can do arithmetic on them.
>>> True + True
2
>>> False * 10
0
We can even use boolean expressions as numbers (although doing so might result in obscure code).
>>> a = 3; b = 4
>>> (a < b) * 10 + (a == b) * 20
10
Emulating bool
- the easy part¶
In this subsection, as preparation for enumerated integers, we will
start to code a subclass of int
that behave like
bool
. We will start with string representation, which is
fairly easy.
>>> class MyBool(int):
... def __repr__(self):
... return 'MyBool.' + ['False', 'True'][self]
This give us the correct string representations. >>> f = MyBool(0) >>> f MyBool.False
>>> t = MyBool(1)
>>> t
MyBool.True
But compare
>>> bool(2) == 1
True
with
>>> MyBool(2) == 1
False
In fact we have
>>> MyBool(2) == 2
True
>>> MyBool(2)
Traceback (most recent call last):
IndexError: list index out of range
Emulating bool
- what goes wrong¶
In many classes we use __init__ to mutate the newly constructed
object, typically by storing or otherwise using the arguments to
__init__. But we can’t do this with a subclass of int
(or
any other immuatable) because they are immutable.
You might try
>>> class InitBool(int):
... def __init__(self, value):
... self = bool(value)
but it won’t work. Look at this - nothing has changed.
>>> x = InitBool(2)
>>> x == 2
True
This line of code
self = bool(value)
is deceptive. It does change the value bound to the self in __init__, but it does not change the object that was passed to __init__.
You might also try
>>> class InitBool2(int):
... def __init__(self, value):
... return bool(value)
but when called it raises an exception
>>> x = InitBool2(2)
Traceback (most recent call last):
TypeError: __init__() should return None, not 'bool'
Emulating bool
- using __new__¶
The solution to the problem is to use __new__. Here we will show that it works, and later we will explain elsewhere exactly what happens. [where?].
>>> class NewBool(int):
... def __new__(cls, value):
... return int.__new__(cls, bool(value))
This works - no exception and 2 is converted into 1.
>>> y = NewBool(2)
>>> y == 1
True
We’ll go carefully through this definition of __new__.
1. We define __new__, which like __init__ has a special role in object creation. But it’s role is to do with creation of a new object, and not the initialisation of an already created object.
2. The function __new__ has two parameters. The first parameter is a class. The way we’ve called it, it will be the NewBool class.
The function __new__ returns a value.
The value returned is
int.__new__(cls, bool(value))
Understanding int.__new__¶
Here’s the docstring for _int.__new__.
>>> print int.__new__.__doc__
T.__new__(S, ...) -> a new object with type S, a subtype of T
Let’s try it, with S and T equal.
>>> z = int.__new__(int, 5) # (*)
>>> z == 5
True
>>> type(z)
<type 'int'>
Thus, we see that line (*) is very much like or perhaps the same as int(5). Let’s try another example.
>>> int('10')
10
>>> int.__new__(int, '21')
21
The docstring above says that S must be a subtype of T. So let’s create one.
>>> class SubInt(int): pass
And now let’s use it as an argument to int.__new__.
>>> subint = int.__new__(SubInt, 11)
Now let’s test the object we’ve just created. We expect it to be an instance of SubInt, and to be equal to 11.
>>> subint == 11
True
>>> type(subint)
<class 'SubInt'>
There we have it. Success. All that’s required to complete the
emulation of bool
is to put all the pieces together.
Note
The key to subclassing immutable types is to use __new__ for both object creation and initialisation.
Exercise Create a class EmulBool that behaves like the
bool
builtin.
Exercise (Hard). Parameterize EmulBool. In other words, create an EnumInt such that
X = EnumInt(['False', 'True'])
creates a class X that behave like EmulBool.
Subclassing tuple
¶
Recall that with EmulBool in Subclassing int we had to define a __new__ method because we need to adjust the values passed to EmulBool before the instance was created.
The desired properties of Point¶
Since Python 2.6, namedtuple
has been part of the
collections
module. We can use it to provide an example of
what is required.
>>> from collections import namedtuple
>>> Point = namedtuple('Point', ('x', 'y'))
Here are some facts about the Point class.
Point is a subclass of tuple.
>>> Point.__bases__ (<type 'tuple'>,)
Two arguments are used to initialise a point.
>>> p = Point(1, 2)
A point has items at 0 and at 1.
>>> p[0], p[1] (1, 2)
We can access these items using the names x and y.
>>> p.x, p.y (1, 2)
Exercise Write an implementation of Point, that satisfies the above. (Please use the hints - they are there to help you.)
Hint To pass 1, 2 and 3 only three lines of code are required.
Hint To pass 4 use property, which replaces getting an attribute by a function call.
Hint The elegant way to pass 4 is to use
operator.itemgetter()
. Use this, and you’ll need only
another 3 lines of code in order to pass 4.
Answer¶
Point is a subclass of tuple.
>>> class Point(tuple): ... def __new__(self, x, y): ... return tuple.__new__(Point, (x, y))
>>> Point.__bases__ (<type 'tuple'>,)
Two arguments are used to initialise a point.
>>> p = Point(1, 2)
A point has items at 0 and at 1.
>>> p[0], p[1] (1, 2)
We can access these items using the names x and y.
>>> import operator
>>> Point.x = property(operator.itemgetter(0)) >>> Point.y = property(operator.itemgetter(1))
>>> p.x, p.y (1, 2)
What happens when you call a class?¶
In this section we describe, in some detail, what happens when you call a class.
Creation and initialisation¶
Recall that every object has a type (sometimes known as a class).
>>> type(None), type(12), type(3.14), type([])
(<type 'NoneType'>, <type 'int'>, <type 'float'>, <type 'list'>)
The result of calling a class C is, ordinarily, an initialised object whose type is C. In Python this process is done by two functions
- __new__ returns an object that has the right type
- __init__ initialises the object created by __new__
To explain we will do the two steps one at a time. This will also clarify some details. But before we begin, we need a simple class.
>>> class A(object):
... def __init__(self, arg):
... self.arg = arg
We will explain what happens when Python executes the following.
a = A('an arg')
First, Python creates an object that has the right type. (The temporary tmp is introduced just to explain what happens. Python stores its value at a nameless location.)
>>> tmp = A.__new__(A, 'an arg')
>>> type(tmp)
<class 'A'>
But it has not been initialised.
>>> tmp.arg
Traceback (most recent call last):
AttributeError: 'A' object has no attribute 'arg'
Second, Python runs our initialisation code.
>>> tmp.__init__('an arg')
>>> tmp.arg
'an arg'
Finally, Python stores the value at a.
>>> a = tmp
The default __new__¶
We did not define a __new__ method for our class A, but all the same Python was able to call A.__new__. How is this possible?
For an instance of a class C, getting an attribute proceeds via the method resolution order of C. Something similar, but with important differences, happens when getting an attribute from C itself (rather than just an instance).
Here’s proof that A.__new__ and object.__new__ are the same object. We show this in two different, but equivalant, ways.
>>> A.__new__ is object.__new__
True
>>> id(A.__new__) == id(object.__new__)
True
This explains how it is that Python can call A.__new__ even though we did not supply such a function ourselves.
For another example, we subclass int.
>>> class subint(int): pass
>>> subint.__new__ is int.__new__
True
Summary¶
Suppose C is a class. When you call, say
C(*argv, **kwargs)
the following happens.
C.__new__ is found.
The result of the following call is stored, say in tmp.
C.__new__(C, *argv, **kwargs)
tmp.__init__ is found.
The result of the following is return as the value of the class call.
self.__init__(*argv, **kwargs)
(Not discussed.) If tmp is not an instance of C (which includes subclasses of C) then steps 3 and 4 are omitted.
Metaclass¶
Every object has a type¶
In Python, every object has a type, and the type of an object is an instance of type.
>>> type(0)
<type 'int'>
>>> isinstance(type(0), type)
True
>>> class A(object): pass
>>> type(A)
<type 'type'>
>>> a = A()
>>> type(a)
<class 'A'>
Even type has a type which is an instance of type (although it’s a little silly).
>>> type(type)
<type 'type'>
>>> isinstance(type, type)
True
The metaclass of an object¶
The metaclass of an object is defined to be the type of its type.
>>> def metaclass(obj):
... return type(type(obj))
>>> metaclass(0)
<type 'type'>
>>> metaclass(metaclass)
<type 'type'>
It’s quite hard to create an object whose metaclass is not type.
A trivial non-type metaclass¶
In Python anything that is a type can be subclassed. So we can subclass type itself.
>>> class subtype(type): pass
We can now use subtype in pretty much the same way as type itself. In particular we can use it to construct an empty class.
>>> cls = subtype('name', (object,), {})
Let’s look at the type and metaclass of cls.
>>> type(cls), metaclass(cls)
(<class 'subtype'>, <type 'type'>)
Notice that type(cls) is not type. This is our way in. Here’s an instance of cls, followed by its type and metaclass.
>>> obj = cls()
>>> type(obj), metaclass(obj)
(<class 'name'>, <class 'subtype'>)
We have just constructed an object with a non-trivial metaclass. The metaclass of obj is subtype.
A non-trivial example¶
When Python executes
obj[key]
behind the scenes it executes
obj.__getitem__[key]
Here’s an example:
>>> class A(object):
... def __getitem__(self, key):
... return getattr(self, key)
>>> obj = A()
>>> obj['name']
Traceback (most recent call last):
AttributeError: 'A' object has no attribute 'name'
>>> obj.name = 'some value'
>>> obj['name']
'some value'
What’s the point?¶
There are two main reasons for introducing and using a metaclass, or in other words a subclass of type.
- We wish to create classes whose behaviour requires special methods or other properties on the type of the class. This sounds and is odd, but can useful. In JavaScript objects we use it to create an elegant and simple implementation in Python of JavaScript object semantics.
- We wish to make the class statement construct a class differently,
somewhat as
bool()
construct a number differently from anint()
. This is described in The __metaclass__ attribute, which is the next section.
The __metaclass__
attribute¶
The __metaclass__ attribute was introduced to give the programmer some control over the semantics of the class statement. In particular it eases the transition from old-style classes (which are not covered in this tutorial) and new-style classes (simply called classes in this tutorial).
Automatic subclassing of object¶
If at the told of a module you write:
__metaclass__ = type
then class statements of the form:
class MyClass: pass
will automatically be new-style. In other words, you don’t have to explicitly place object in the list of bases. (This behaviour is a consequence of the semantics of __metaclass__.)
Review of type(name, bases, body) and class statement¶
Recall that the type command, called like so
cls = type(name, bases, body)
constructs the class cls, as does the class statement
class cls(...): # body statements go here
The __metaclass__ attribute provides a link between these two ways of constructing classes.
The basic principle of the __metaclass__¶
Ordinarily, a class statement results in a call to type, with name, bases and body as arguments. However, this can be changed by
- Assigning __metaclass__ as an class body attribute.
- Assigning __metaclass__ as a module attribute.
- Placing a suitable class in the bases of the class statement.
Method (1) is used above, in Automatic subclassing of object. To explain (2) we will introduce a very silly example.
A very silly example¶
It’s not necessary for the __metaclass__ attribute to be type or a subclass of type. It could be any callable.
Here it is a function that returns a string.
>>> class very_silly(object):
... def __metaclass__(*argv):
... return 'This is very silly.'
The variable silly
bound by the class statement is a string.
In fact, it is the return value of the __metaclass__ attribute.
>>> very_silly
'This is very silly.'
A less silly example¶
Here’s a less silly example. We define the __metaclass__ to return the argument vector passed to it. This consists of name, bases and body.
>>> class silly(object):
... def __metaclass__(*argv):
... return argv
The variable silly is now bound to the value of argv. So it is a tuple of length 3, and it can be unpacked into name, bases and body.
>>> type(silly), len(silly)
(<type 'tuple'>, 3)
>>> name, bases, body = silly
The name, and bases are much as we expect them.
>>> name == 'silly', bases ==(object,)
(True, True)
The body has, as could be expected, a __metaclass__ key, which has the expected value.
>>> sorted(body.keys())
['__metaclass__', '__module__']
>>> silly[2]['__metaclass__']
<function __metaclass__ at 0x...>
A __metaclass__ gotcha¶
A class statement, if it does not raise an exception, assigns a value to a variable. Ordinarily, this value is a direct instance of type, namely
type(name, bases, body)
However, using __metaclass__ above allows the value assigned by a class statement to be any object whatsover. In the very silly example the value assigned by the class statement was a string. This is a violates the principle of least surprise, and that is the main reason why the example is very silly (and not that it does nothing useful).
With decorators, which are available on class statements since Python 2.6, the same effect as the silly example can be obtained without resort to complex magic.
A decorator example¶
Here we produce something similar to the silly example. First we define a decorator
>>> from jfine.classtools import dict_from_class
>>> def type_argv_from_class(cls):
... d = cls.__dict__
... name = cls.__name__
... body = dict_from_class(cls)
... bases = cls.__bases__
... return name, bases, body
Now we use the decorator. There is no magic. The class statement
produces a class, and the decorator function
type_args_from_class()
produces an argument vector from the
class.
>>> @type_argv_from_class
... class argv(object):
... key = 'a value'
When we unpack argv
we get what we expect.
>>> name, bases, body = argv
>>> name
'argv'
>>> bases
(<type 'object'>,)
>>> body
{'key': 'a value'}
Decorators versus __metaclass__¶
Whenever a __metaclass__ is used, one could also use a decorator to get effectively the same result. This section discusses this topic.
For an example we use the concept of a Bunch, as discussed in Alex Martelli’s excellent book Python in a Nutshell. As he says, a Bunch is similar to the struct type in C.
Bunch using decorators¶
Here we give a construction based on the decorator point of view. First we define a function, which can be used as a decorator, that returns a bunch class.
>>> def bunch_from_dict(a_dict, name='a_bunch'):
...
... __slots__ = sorted(a_dict.keys())
... defaults = dict(a_dict)
... bases = (BaseBunch,)
...
... def __init__(self, **kwargs):
... for d in defaults, kwargs:
... for key, value in d.items():
... setattr(self, key, value)
...
... body = dict(__slots__=__slots__, __init__=__init__)
... return type(name, bases, body)
We now need to implement the BaseBunch class, from which the return bunch classes will inherit __repr__ and, if we wish, other attributes.
>>> class BaseBunch(object):
... def __repr__(self):
... body = ', '.join([
... '%s=%r' % (key, getattr(self, key))
... for key in self.__slots__
... ])
... return '%s(%s)' % (self.__class__.__name__, body)
Here’s an example of the creation of a Point class.
>>> Point = bunch_from_dict(dict(x=0, y=0), 'Point')
And here are examples of its use.
>>> Point(x=1, y=3)
Point(x=1, y=3)
>>> Point()
Point(x=0, y=0)
We can also use bunch_from_dict as a decorator.
>>> from jfine.classtools import dict_from_class
>>> @bunch_from_dict
... @dict_from_class
... class RGB(object):
... 'This is a docstring.'
... red = green = blue = 0
We could, of course, introduce a new decorator
bunch_from_class()
to make life a little easier for the user.
Here’s an example of the use of the RGB class. It shows that the name of the class is not being properly picked up. This is an interface problem rather than a problem with the decorator approach. The name is available to be used, but the interface is not making it available. Similar remarks apply to the docstring.
>>> RGB(blue=45, green=150)
a_bunch(blue=45, green=150, red=0)
Bunch using __metaclass__¶
The code here is based on the __metaclass__ implementation of Bunch, given in Python in a Nutshell. The API is:
class Point(MetaBunch):
x = 0.0
y = 0.0
The base class MetaBunch()
is defined by:
class MetaBunch(object):
__metaclass__ = metaMetaBunch
The real work is done in
class metaMetaBunch(type):
def __new__(cls, name, bases, body):
# Creation of new_body similar to bunch_from_dict.
# ... but first need to 'clean up' the body.
new_body = ... # Computed from body
# Creation of new instance similar to bunch_from_dict.
# ... but here can't use type(name, bases, new_body)
return type.__new__(cls, name, bases, new_body)
where I’ve omitted the crucial code that computes the new_body from the old. (My focus here is on the logic of __metaclass_ and not the construction of the new body.)
How __metaclass__ works¶
In Python the class statement creates the class body from the code you have written, placing it in a dictionary. It also picks up the name and the bases in the first line of the class statement. These three arguments, (name, bases, body) are then passed to a function.
The __metaclass__ attribute is part of determining that function. If __metaclass__ is a key in the body dictionary then the value of that key is used. This value could be anything, although if not callable an exception will be raised.
In the example above, the MetaBunch class body has a key __metaclass__, and so its value metaMetaBunch is used. It is metaMetaBunch that is used to create the value that is stored at MetaBunch.
What is that value? When we instantiate metaMetaBunch we use its __new__ method to create the instance, which is an instance of type. In particular, the code that creates the new_body is run on the body of MetaBunch.
Now what happens when we subclass MetaBunch. One might think that
- because Point inherits from MetaBunch
- and because MetaBunch has a __metaclass__ in its body
- and that __metaclass__ has value metaMetaBunch
it follows that metaMetaBunch is use to construct the Point class.
But this is gotcha. Even though the conclusion is correct the reasoning is not. What happens is that
- Python looks for __metaclass__ in the body of Point
- but it’s not there so it looks at the bases of Point
- and in the bases it finds MetaBunch
- whose type is metaMetaBunch
and so it uses that instead of type when constructing Point.
Discussion¶
Here are the main differences between the two approaches.
The decorator approach
- Syntax differs from ordinary class statement.
- Awkward if class decorators are not available.
- As is, the name is not picked up.
- Easier to construct Bunch classes dynamically.
- The Point class is an instance of type.
The __metaclass__ approach
- Syntax the same as ordinary class statement.
- ‘Magic’ takes place behind the scenes.
- Requires more knowledge to implement.
- Awkward to construct Bunch classes dynamically.
- The Point class is an instance of MetaBunch.
My view is that using decorators is simpler than using __metaclass__, particularly if the decorator syntax is available.
JavaScript objects¶
Like Python classes¶
In JavaScript all objects are part of an inheritance tree. The create function adds a node to the inheritance tree.
// A JavaScript object.
js> root = {}
// Prototype inheritance.
js> create = function (obj) {
var f = function () {return this;};
f.prototype = obj;
return new f;
}
js> a = create(root)
js> b = create(a)
js> a.name = 5
js> a.name
5
js> b.name
5
In Python classes inherit in the same way.
>>> root = type # Most classes are instance of type.
>>> class a(root): pass
>>> class b(a): pass # Class inheritance.
>>> a.name = 5 # Just like JavaScript.
>>> a.name
5
>>> b.name
5
class explanation¶
In Python we can subclass anything whose type is type (or a subclass of type). A subclass (and its instances) inherits properties from the super-class.
>>> type(root) == type(a) == type(b) == type
True
Custom item methods¶
In JavaScript attribute and item access are the same.
js> a = create(root)
js> a.name = 5
js> a['name']
5
js> a['key'] = 6
js> a.key
6
js> a[1] = 6
js> a['1']
6
In Python we can defined our own item methods. (The programmer owns the dot.)
>>> class A(object):
...
... def __getitem__(self, key):
... return getattr(self, str(key))
... def __setitem__(self, key, value):
... return setattr(self, str(key), value)
>>> a = A()
>>> a.name = 5
>>> a['name']
5
>>> a['key'] = 6
>>> a.key
6
>>> a[1] = 6
>>> a['1']
6
Because type(a) is A, which has the special item methods, we get the special item behaviour.
>>> type(a) is A
True
On metaclass¶
Using previous definition, we cannot subclass a to create b.
>>> class b(a): pass
Traceback (most recent call last):
class b(a): pass
TypeError: Error when calling the metaclass bases
object.__new__() takes no parameters
This is because a is not a type. The solution involves Python metaclasses (an advanced topic).
>>> isinstance(a, type)
False
metaclass construction¶
We will subclass type, not object, and add to it the special item methods.
>>> class ObjectType(type):
...
... def __getitem__(self, key):
... return getattr(self, str(key))
...
... def __setitem__(self, key, value):
... return setattr(self, str(key), value)
Here is a fancy way of calling ObjectType.
>>> class root(object): ... __metaclass__ = ObjectType
Here is a more direct (and equivalent) construction (create an instance of ObjectType, whose instances are objects).
>>> root = ObjectType('root', (object,), {})
>>> isinstance(root(), object)
True
metaclass demonstration¶
>>> class a(root): pass
>>> class b(a): pass
>>> a.name = 5
>>> a.name
5
>>> b.name
5
>>> a['name']
5
>>> b['name']
5
>>> a[1] = 6
>>> a['1']
6
metaclass explanation¶
Because type(root) is a subclass of type we can subclass root.
>>> issubclass(type(root), type)
True
Because the type(root) is ObjectType, which has special item methods, we get the special item behaviour.
>>> type(root) == type(a) == type(b) == ObjectType
True
Never instantiated¶
We can’t call JavaScript objects (unless they are a function). But create creates ordinary JavaScript objects.
js> a = create(root)
js> a(1, 2, 3)
TypeError: a is not a function
We will monkey-patch the previous Python class, to provide custom behaviour when called.
>>> def raise_not_a_function(obj, *argv, **kwargs):
... raise TypeError, obj.__name__ + ' is not a function'
>>> ObjectType.__call__ = raise_not_a_function
>>> a(1, 2, 3)
Traceback (most recent call last):
a(1, 2, 3)
TypeError: a is not a function
Conclusion¶
JavaScript objects are like Python classes (because they inherit like Python classes).
For JavaScript attribute and item access are the same. This is achieved in Python by providing custom item methods.
In Python the custom item methods must be placed on the type of the object (or a superclass of its type).
Ordinary JavaScript objects are not functions and cannot be called. A Python class can be called (to create an instance of the object). But we can override this behaviour by supplying a custom method for call.
To summarize: ..
JavaScript objects are like Python classes with custom item methods (on the metaclass) that are never instantiated.
It’s worth saying again:
JavaScript objects are like Python classes with custom item methods (on the metaclass) that are never instantiated.
Exercise: A line from a file¶
We want to read lines from one of more files. We want each line to
- be a string
- have a filename attribute
- have a linenumber attribute
Recall that we can already iterate over the lines of a file.
The interface I suggest is
filename = 'myfile.txt'
f = open(filename)
labelled_lines = LabelledLines(f, filename)
The behavior we’d like is for this code
for line in labelled_lines:
print (line.filename, line.linenumber, line)
to produce output like
('myfile.txt', 0, 'First line\n')
('myfile.txt', 1, 'Second line\n')
('myfile.txt', 2, 'Third line\n')
Exercise: Property from class decorator¶
A property is a way of providing a class with virtual or protected attributes. They can be a good way of hiding and protecting implementation details. The area property of a rectangle is a good example of a read-only property.
Some properties are both read and write. There may also be a case for write-only properties. One can also delete a property. To help the programmer, a property can have a docstring.
The signature for property is
property(fget=None, fset=None, fdel=None, doc=None)
The exercise is to make a decorator that simplifies the creation of complex property attributes. The interface I suggest is
class MyClass(object):
@property_from_class
class my_property(object):
'''This is to be the doc string for the property.'''
def fget(self):
pass # code goes here
def fset(self):
pass # code goes here
def fdel(self):
pass # code goes here
Any or all of fget, fset, fdel can be omitted, as can the docstring. It should be an error to ‘use the wrong keyword’.
Exercise: Named integers¶
The bool type produces True and False, which are something like named integers. The exercise is to produce, from something like a mapping, a subclass of int that provides named integers.
Just as True and False look nicer that 0 or 1 when a quantity is a boolean, so named integers look nicer with symbolic constants.
The interface should be something like
my_items = [(0, 'zero'), (1, 'one'), (2, 'two')]
OneTwoThree = whatsit(my_items)
The behavior should be something like
z = OneTwoThree('zero')
str(z) == 'zero'
t = OneTwoThree(2)
str(t) == 'two'
Thus, any string or integer (within range) can be used to produce a named integer. Out of range values should produce an exception.
Exercise: Subset of a set¶
The task here is to produce a memory efficient way of representing a subset of a given base set. We will use bytes, and for simplicity we assume that the base set has at most eight elements.
In Python 2.6 bytes is an alternative name for string, and in Python 3 it is a separate type in its own right (with a somewhat different interface). Please do the exercise in Python 2.6 (or earlier, with bytes equal to str).
The interface I suggest is something like
all_vowels = 'aeiou'
SubsetOfVowels = SubsetClassFactory(all_vowels)
my_vowels = SubsetOfVowels('ie')
set(my_vowels) == set(['i', 'e'])
ord(my_vowels[0]) == 2 + 4
Don’t deal with set operations, such as intersection or complement.
By the way, an application would be dealing with large numbers of subsets of a largish set (set 255 elements), using numpy to store the data and do the hard work. So set operation details would have to fit in with numpy.
Exercise: Class to and from class data¶
Sometimes __metaclass__ is used to amend the operation of a class statement. However always the same change can be done with a decorator function, and often this is clearer and easier.
The exercise here is to produce some class from class decorators. The first task is to produce two decorators whose composition is trivial.
In other words this
@class_from_class_data
@class_data_from_class
class MyClass(object):
pass
should be equivalent to this
class MyClass(object):
pass
Once we have done this, it’s a lot easier to modify classes during construction, because so to speak the input-output has already been dealt with. Simply write a function that changes or creates a class data object.
The decorator function class_data_from_class should produce class_data, which we can regard as a tuple.
The decorator function class_from_class_data should produce a class from the class data.
Note
Don’t assume that the type of MyClass is type. It could be a subclass of type.
Exercise: Your own class to class decorator¶
This is an open exercise. The task is to find a situation where you need to change a class during its construction via a class statement, and then to write a class to class decorator that does this.
Please use the class_to_class_data and class_data_to_class decorators, either the ones supplied by the tutorial or your own (if you think they are better).
Here’s a template, to get you started.
def my_class_decorator(cls):
class_data = class_data_from_class(cls)
new_class_data = whatever_you_want_it_to_be
new_cls = class_from_class_data(new_class_data)
return new_cls
Here are some ideas
- Check that the class supplies certain methods
- Perform other checks on the class
- Change methods so all calls are logged
- Supply extra utility methods to the class
- Refactor existing code that depends on __metaclass__