
[ The choice of key to keep track of functions already running in recursive calls ]

The recursive_repr function in the reprlib module, introduced in Python 3.2, has the following source code:

from _thread import get_ident  # reprlib gets get_ident from the _thread module

def recursive_repr(fillvalue='...'):
    'Decorator to make a repr function return fillvalue for a recursive call'

    def decorating_function(user_function):
        repr_running = set()

        def wrapper(self):
            key = id(self), get_ident()
            if key in repr_running:
                return fillvalue
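            repr_running.add(key)
            try:
                result = user_function(self)
            finally:
                # Always clear the key, even if user_function raises
                repr_running.discard(key)
            return result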

        # Can't use functools.wraps() here because of bootstrap issues
        wrapper.__module__ = getattr(user_function, '__module__')
        wrapper.__doc__ = getattr(user_function, '__doc__')
        wrapper.__name__ = getattr(user_function, '__name__')
        wrapper.__annotations__ = getattr(user_function, '__annotations__', {})
        return wrapper

    return decorating_function

The key identifying a __repr__ call that is already running is (id(self), get_ident()).
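
For context, here is a minimal sketch of how the decorator is typically applied. The Box class and its items attribute are hypothetical names made up for this illustration; only recursive_repr comes from reprlib.

```python
from reprlib import recursive_repr


class Box:
    """Hypothetical container that can end up holding itself."""

    def __init__(self):
        self.items = []

    @recursive_repr()
    def __repr__(self):
        # Formatting self.items calls repr() on every element, including
        # this Box when it contains itself.
        return 'Box(%r)' % (self.items,)


box = Box()
box.items.append(box)   # the box now contains itself
text = repr(box)        # the nested call returns the fill value '...'
print(text)             # Box([...])
```

Without the decorator, the same repr() call would recurse until Python raises RecursionError.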

Why wasn't self itself used as the key? And why was get_ident needed?

Answer 1

Consider this code:

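a = []
b = []
a.append(b)
b.append(a)
a == b

Comparing these two lists recurses until Python raises RecursionError (a stack overflow, in effect). But the algorithm needs to handle exactly this kind of object safely. If you put self into the set, membership tests would go through the object's own __hash__ and __eq__, which can fail or recurse like this, and unhashable objects such as lists could not be added to the set at all. So we use id(self) instead, which never inspects the object's contents. We only care whether it is the exact same object.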
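
A small sketch of the difference, using two lists that contain each other. The variable names (running, compare_failed, hash_failed) are only for this illustration.

```python
# Two lists that contain each other.
a = []
b = []
a.append(b)
b.append(a)

# Comparing them with == recurses until Python gives up.
try:
    a == b
    compare_failed = False
except RecursionError:
    compare_failed = True

# Lists are unhashable, so they cannot be stored in a set at all.
try:
    {a}
    hash_failed = False
except TypeError:
    hash_failed = True

# id() never looks inside the object, so this is always safe.
running = set()
running.add(id(a))
print(compare_failed, hash_failed, id(a) in running, id(b) in running)
```

Both direct approaches fail on this input, while the id()-based membership test identifies the exact object without touching its contents.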

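As for get_ident, consider what happens if two threads use this code at the same time. The repr_running set is shared between all threads. If the key were only id(self), a thread that calls repr() on an object while another thread is still partway through that object's __repr__ would find the key already in the set and wrongly get fillvalue back, even though nothing recursive is happening in that thread. get_ident() returns an identifier that is unique among the threads currently alive, so including it in the key gives each thread its own entries, and one thread's call in progress is never mistaken for another thread's recursion.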
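
The threading point can be sketched with a simplified stand-in for the decorator. Here make_guard, repr_from_second_thread and Node are hypothetical helpers written for this illustration, not part of reprlib; the per_thread flag switches between a key of id(self) alone and the (id(self), get_ident()) pair.

```python
import threading
from threading import get_ident


def make_guard(per_thread):
    """Simplified stand-in for recursive_repr (hypothetical helper)."""
    running = set()

    def decorator(func):
        def wrapper(self):
            key = (id(self), get_ident()) if per_thread else id(self)
            if key in running:
                return '...'
            running.add(key)
            try:
                return func(self)
            finally:
                running.discard(key)
        return wrapper
    return decorator


def repr_from_second_thread(per_thread):
    """Return what a second thread sees while a first thread is mid-repr."""
    inside = threading.Event()    # set once the first caller is inside __repr__
    release = threading.Event()   # lets the first caller finish

    class Node:
        @make_guard(per_thread)
        def __repr__(self):
            if not inside.is_set():       # only the first caller pauses
                inside.set()
                release.wait(timeout=5)
            return 'Node()'

    node = Node()
    first = threading.Thread(target=repr, args=(node,))
    first.start()
    inside.wait(timeout=5)        # first thread is now partway through __repr__

    seen = []
    second = threading.Thread(target=lambda: seen.append(repr(node)))
    second.start()
    second.join()

    release.set()
    first.join()
    return seen[0]


shared_key = repr_from_second_thread(per_thread=False)
thread_key = repr_from_second_thread(per_thread=True)
print(shared_key)   # ...
print(thread_key)   # Node()
```

With the id-only key, the second thread is told the call is recursive when it is not; with the thread identifier in the key, it gets the real representation.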