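Note: This question is for informational purposes only. I am interested to see how deep into Python's internals it is possible to go with this.
Not long ago, a discussion started under a certain question about whether a string passed to print could be modified during or after the call to print. For example, consider the function: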
def print_something():
    print('This cat was scared.')
Now, when print is run, the output on the terminal should be:
This dog was scared.
There are a couple of ways to do that, but they're all very ugly, and should never be done. The least ugly way is to probably replace the code object inside the function with one with a different co_consts list. Next is probably reaching into the C API to access the str's internal buffer. [...]
>>> import inspect
>>> exec(inspect.getsource(print_something).replace('cat', 'dog'))
>>> print_something()
This dog was scared.
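Of course, exec is bad, but this doesn't really answer the question, because it doesn't actually modify anything during or after the call to print.
Best answer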
First, there's actually a much less hacky way. All we want to do is change what print prints, right?
_print = print
def print(*args, **kw):
    args = (arg.replace('cat', 'dog') if isinstance(arg, str) else arg
            for arg in args)
    _print(*args, **kw)
Or, similarly, you can replace sys.stdout instead of print.
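A minimal sketch of the sys.stdout variant, for illustration only. The wrapper class ReplacingStream and the run_demo function are hypothetical names, not part of the original answer. The wrapper rewrites text on its way to the real stream, so print itself is left alone; contextlib.redirect_stdout is used here simply as a scoped way of assigning sys.stdout.

```python
import contextlib
import io
import sys


class ReplacingStream:
    """Hypothetical helper: forwards writes to a real stream, replacing text."""

    def __init__(self, stream, old, new):
        self._stream = stream
        self._old = old
        self._new = new

    def write(self, text):
        # print() hands write() the already-formatted text.
        return self._stream.write(text.replace(self._old, self._new))

    def __getattr__(self, name):
        # Delegate everything else (flush, encoding, ...) to the real stream.
        return getattr(self._stream, name)


def print_something():
    print('This cat was scared.')


def run_demo():
    """Call print_something() with sys.stdout wrapped; return what was written."""
    captured = io.StringIO()
    # redirect_stdout temporarily assigns our wrapper to sys.stdout.
    with contextlib.redirect_stdout(ReplacingStream(captured, 'cat', 'dog')):
        print_something()
    return captured.getvalue()


if __name__ == '__main__':
    sys.stdout.write(run_demo())
```

One limitation of this sketch: a word that happens to be split across two separate write() calls would not be replaced.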
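Also, there's nothing wrong with the exec … getsource … idea. Well, of course there's plenty wrong with it, but less than with what follows…
If you want to work with code objects, use a library like bytecode (when it's finished) or byteplay (until then, or for older Python versions) instead of doing it by hand. Even for something this trivial, the CodeType initializer is a pain; if you actually need to do something like fix up lnotab, only a lunatic would do that manually.
import types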

def print_function():
    print("This cat was scared.")

def main():
    # A function object is a wrapper around a code object, with
    # a bit of extra stuff like default values and closure cells.
    # See inspect module docs for more details.
    co = print_function.__code__
    # A code object is a wrapper around a string of bytecode, with a
    # whole bunch of extra stuff, including a list of constants used
    # by that bytecode. Again see inspect module docs. Anyway, inside
    # the bytecode for print_function (which you can read by typing
    # dis.dis(print_function) in your REPL), there's going to be an
    # instruction like LOAD_CONST 1 to load the string literal onto
    # the stack to pass to the print function, and that works by just
    # reading co.co_consts[1]. So, that's what we want to change.
    consts = tuple(c.replace("cat", "dog") if isinstance(c, str) else c
                   for c in co.co_consts)
    # Unfortunately, code objects are immutable, so we have to create
    # a new one, copying over everything except for co_consts, which
    # we'll replace. And the initializer has a zillion parameters.
    # Try help(types.CodeType) at the REPL to see the whole list.
    # Note: this positional argument list matches CPython 3.7 and
    # earlier 3.x releases; 3.8 and later take additional parameters.
    co = types.CodeType(
        co.co_argcount, co.co_kwonlyargcount, co.co_nlocals,
        co.co_stacksize, co.co_flags, co.co_code,
        consts, co.co_names, co.co_varnames, co.co_filename,
        co.co_name, co.co_firstlineno, co.co_lnotab,
        co.co_freevars, co.co_cellvars)
    print_function.__code__ = co
    print_function()

main()
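On CPython 3.8 and later, the positional constructor call above no longer lines up with the CodeType signature, but code objects gained a replace() method that copies every field except the ones you name. Here is a minimal sketch of the same co_consts swap using it. It assumes CPython 3.8+, and swap_constant and run_demo are hypothetical helper names, not part of the original answer.

```python
import contextlib
import io


def print_function():
    print("This cat was scared.")


def swap_constant(func, old, new):
    """Hypothetical helper: rebuild func's code object with edited str constants."""
    co = func.__code__
    consts = tuple(c.replace(old, new) if isinstance(c, str) else c
                   for c in co.co_consts)
    # CodeType.replace() (CPython 3.8+) returns a copy of the code object
    # with only the named fields changed.
    func.__code__ = co.replace(co_consts=consts)


def run_demo():
    """Patch print_function, call it, and return what it printed."""
    swap_constant(print_function, "cat", "dog")
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        print_function()
    return captured.getvalue()


if __name__ == "__main__":
    print(run_demo(), end="")
```

The source text of print_function still says "cat"; only the constant stored in its code object has changed.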
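Hacking up code objects can go wrong in several ways: RuntimeErrors that eat up the whole stack, more normal RuntimeErrors that can be handled, or garbage values that will probably just raise a TypeError or AttributeError when you try to use them. For example, try creating a code object with just a RETURN_VALUE and nothing on the stack (bytecode b'S\0' for 3.6+, b'S' before), or with an empty tuple for co_consts when there's a LOAD_CONST 0 in the bytecode, or with varnames decremented by 1 so the highest LOAD_FAST actually loads a freevar/cellvar cell. For some real fun, if you get the lnotab wrong enough, your code will only segfault when run in the debugger.
Using bytecode or byteplay won't protect you from all of these problems, but they do have some basic sanity checks, and nice helpers that let you do things like insert a chunk of code and let the library worry about updating all the offsets and labels so you can't get them wrong, and so on. (Plus, they save you from typing that ridiculous 6-line constructor and from debugging the silly typos that come from doing so.)
If you're using CPython, there's a C API for accessing objects, and you can use ctypes to access that API from within Python itself, which is such a terrible idea that they put a pythonapi right there in the stdlib's ctypes module. :) The most important trick you need to know is that id(x) is the actual pointer to x in memory (as an int).
The code below uses the superhackyinternals project from my GitHub. (It's intentionally not pip-installable, because you really shouldn't be using it except to experiment with your local build of the interpreter and the like.)
import ctypes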
import internals # https://github.com/abarnert/superhackyinternals/blob/master/internals.py

def print_function():
    print("This cat was scared.")

def main():
    for c in print_function.__code__.co_consts:
        if isinstance(c, str):
            idx = c.find('cat')
            if idx != -1:
                # Too much to explain here; just guess and learn to
                # love the segfaults...
                p = internals.PyUnicodeObject.from_address(id(c))
                assert p.compact and p.ascii
                addr = id(c) + internals.PyUnicodeObject.utf8_length.offset
                buf = (ctypes.c_int8 * 3).from_address(addr + idx)
                buf[:3] = b'dog'
    print_function()

main()
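For readers who want to see where that buffer lives without pulling in superhackyinternals, here is a read-only sketch. It assumes CPython and a pure-ASCII string stored in the compact ASCII layout, where the character data sits at the very end of the object. The helper names peek_ascii_buffer and find_cat_constant, and the sys.getsizeof arithmetic used to locate the data, are illustrative assumptions rather than part of the original answer. Because it only reads memory, it cannot corrupt the string the way the code above does.

```python
import ctypes
import sys


def peek_ascii_buffer(s):
    """Hypothetical helper: read a compact-ASCII str's bytes straight from memory."""
    if not s.isascii():
        raise ValueError("this sketch only handles pure-ASCII strings")
    # In CPython, id(s) is the object's address. For a compact ASCII string
    # the object is a fixed-size header followed by len(s) bytes plus a
    # trailing NUL, so the data starts len(s) + 1 bytes before the end.
    data_addr = id(s) + sys.getsizeof(s) - len(s) - 1
    return ctypes.string_at(data_addr, len(s))


def print_function():
    print("This cat was scared.")


def find_cat_constant():
    """Return the raw bytes of the first str constant that contains 'cat'."""
    for c in print_function.__code__.co_consts:
        if isinstance(c, str) and 'cat' in c:
            return peek_ascii_buffer(c)
    return None


if __name__ == "__main__":
    print(find_cat_constant())
```

Writing to that address instead of reading from it is what turns this from a curiosity into the hack shown above, with all the segfault risk that comes with it.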
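int is a whole lot simpler under the covers than str. And it's a lot easier to guess what you can break by changing the value of 2 to 1, right? Actually, forget imagining, let's just do it (using the types from superhackyinternals again):
>>> n = 2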
>>> pn = PyLongObject.from_address(id(n))
>>> pn.ob_digit[0]
2
>>> pn.ob_digit[0] = 1
>>> 2
1
>>> n * 3
3
>>> i = 10
>>> while i < 40:
...     i *= 2
...     print(i)
10
10
10
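The loop never ends, because i *= 2 now multiplies by 1.
In a different interactive shell, the first attempt to evaluate 2 at the prompt sent it into some kind of uninterruptible infinite loop. Presumably it uses the number 2 for something in its REPL loop, while the stock interpreter doesn't?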
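The patch to n affects every 2 in the process because CPython keeps one shared object for each small integer, so n and the literal 2 are the same object. This is an implementation detail of CPython, not a language guarantee. A harmless way to observe that sharing, using only identity checks (the function names are hypothetical):

```python
def same_small_int_object():
    """Return True if two independently computed 2s are one object."""
    a = int("2")    # built at run time from a string
    b = len("ab")   # built at run time by a different route
    return a is b


def small_int_ids_match():
    # id() is the object's address in CPython, so equal ids mean the
    # ctypes hack would patch the same underlying C struct.
    return id(int("2")) == id(len("xy"))


if __name__ == "__main__":
    print(same_small_int_object(), small_int_ids_match())
```

Since every computation that produces 2 hands back that one object, overwriting its digit changes the result everywhere at once.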
Regarding python - Is it possible to "hack" Python's print function?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49271750/