Python 闭包杂谈

来源:转载

引例


学 Python 的人,或许都会遇到这么一个经典的问题。此问题可通过nonlocal
解决


def outer():
n = 0
def inner():
n = n + 1
return n
return inner
outer()()

运行后便会发生如下的异常


Traceback (most recent call last):
File "test.py", line 8, in <module>
outer()()
File "test.py", line 4, in inner
n = n + 1
UnboundLocalError: local variable 'n' referenced before assignment

这个问题和下面这个是类似的


n = 1
def func():
if n == 0:
n = 2
func()


n = 2
是永远不会执行的。但还是会出现相同的异常


或者这个问题


n = 1
def func():
print(n)
n = 2
func()

等等……



JavaScript 也有这类问题,称为 Hoisting 参考自JavaScript 秘密花园


var myvar = 'my value';
(function() {
alert(myvar);// undefined
var myvar = 'local value';
})();

要理解这个问题还是需要一些 PVM 和 字节码的知识的。Python 首先会对源代码进行编译,形成字节码,然后将字节码载入运行。所有的分支都会参与编译的。编译时会生成符号表。详见我的上篇文章


那么,让我们从字节码的角度来看看


In [39]: def outer():
...: m = 0
...: n = 0
...: def inner():
...: n = n + m
...: return inner
...:
...:
In [40]: dis(outer())
5 0 LOAD_FAST0 (n)
3 LOAD_DEREF 0 (m)
6 BINARY_ADD
7 STORE_FAST 0 (n)
10 LOAD_CONST 0 (None)
13 RETURN_VALUE


对于m
和n
, Python 使用了不同的 LOAD 操作


LOAD_FAST
从局部变量表co_varnames
载入
LOAD_DEREF
从 cell 变量和自由变量的存储区载入。cell 变量指被内部作用域所引用的变量,自由变量是引用的外层作用域中的变量。比如定义在外层函数中的m
,对于内层函数说它是自由变量,对于外层函数它是 cell 变量

再来看看符号表


In [41]: c = outer()
In [42]: c.__code__.co_cellvars
Out[42]: ()
In [43]: c.__code__.co_varnames
Out[43]: ('n',)
In [44]: c.__code__.co_freevars
Out[44]: ('m',)
In [45]: outer.__code__.co_cellvars
Out[45]: ('m',)
In [46]: outer.__code__.co_varnames
Out[46]: ('n', 'inner')
In [47]: outer.__code__.co_freevars
Out[47]: ()


对比n
和m
。体会一下,我们可以发现如果在函数中出现赋值语句,则编译器会认为其是函数的一个局部变量,不会认为其是一个自由变量,n
被放到了局部变量表co_varnames
中。而由于 Python 是先执行的n + 1
,先去 LOADn
。但n
并没有绑定任何对象,所以会抛出UnboundLocalError



所以在出现nonlocal
关键字之前, Python 的闭包作用域可以认为是只读的,当然你也可以用可变类型 trick 掉这个机制。PEP 227 -- Statically Nested Scopes
是这样描述这个现象的


If a name is bound anywhere within a code block, all uses of the name within the block are treated as references to the current block.


闭包为何能引用自由变量
In [3]: def outer():
...: n = 0
...: def inner():
...: m = n + 1
...: return inner
...:
In [4]: dis(outer)
2 0 LOAD_CONST 1 (0)
3 STORE_DEREF0 (n)
3 6 LOAD_CLOSURE 0 (n)
9 BUILD_TUPLE1
12 LOAD_CONST 2 (<code object inner at 0x7fb71e64ba50, file "<ipython-input-3-d1d0f7f91b20>", line 3>)
15 LOAD_CONST 3 ('outer.<locals>.inner')
18 MAKE_CLOSURE 0
21 STORE_FAST 0 (inner)
524 LOAD_FAST0 (inner)
27 RETURN_VALUE
STORE_DEREF(i)
Stores TOS into the cell contained in sloti
of the cell and free variable storage.
LOAD_CLOSURE(i)
Pushes a reference to the cell contained in sloti
of the cell and free variable storage. The name of the variable isco_cellvars[i]
ifi
is less than the length ofco_cellvars
. Otherwise it isco_freevars[i - len(co_cellvars)]
.
MAKE_CLOSURE(argc)
Creates a new function object, sets its__closure__
slot, and pushes it on the stack. TOS is the qualified name of the function, TOS1 is the code associated with the function, and TOS2 is the tuple containing cells for the closure’s free variables. argc is interpreted as inMAKE_FUNCTION
; the annotations and defaults are also in the same order below TOS2.


上述字节码做的就是将闭包引用到的变量组成元组然后赋给闭包的__closure__


In [5]: outer().__closure__
Out[5]: (<cell at 0x7fb71e6598b8: int object at 0x8c1620>,)
In [6]: outer().__closure__[0].cell_contents
Out[6]: 0


__closure__
对应着 Python 查找变量的顺序local -> enclosing -> global -> builtin
中的 enclosing,这便是能够引用到自由变量的理由


分享给朋友:
您可能感兴趣的文章:
随机阅读: