Python 官方简明教程 11:修订间差异

来自Ubuntu中文
跳到导航跳到搜索
Yq-ysy留言 | 贡献
以内容'=-11. 标准库概要-第二部分 Brief Tour of the Standard Library – Part II(已翻译,尚未校对)= '''感谢“[http://www.pydn.cn/forum.php 中译社]”翻…'创建新页面
 
Yq-ysy留言 | 贡献
 
(未显示同一用户的2个中间版本)
第1行: 第1行:
=-11. 标准库概要-第二部分 Brief Tour of the Standard Library – Part II(已翻译,尚未校对)=
=-11. 标准库概要-第二部分 Brief Tour of the Standard Library – Part II(已翻译,尚未校对)=
'''感谢“[http://www.pydn.cn/forum.php 中译社]”翻译本页(尚未校对),欢迎大家到[http://www.pydn.cn/forum.php Python开发者网络PYDN]参加 Pythn 3.2 文档的翻译工作'''
'''感谢“[http://www.pydn.cn/forum.php 中译社]”翻译本页(尚未校对),可以到[http://docspy3zh.readthedocs.org/en/latest/tutorial/index.html readthedocs.org]查看更新更专业的翻译版本'''


This second tour covers more advanced modules that support professional programming needs. These modules rarely occur in small scripts.
This second tour covers more advanced modules that support professional programming needs. These modules rarely occur in small scripts.


第二部分包含了支持专业编程工作所需的更高级的模块,这些模块很少出现在小脚本中。
第二部分包含了支持专业编程工作所需的更高级的模块,这些模块很少出现在小脚本中。
11.1. 输出格式
==-11.1. 输出格式 Output Formatting==


The reprlib module provides a version of repr() customized for abbreviated displays of large or deeply nested containers:
The reprlib module provides a version of repr() customized for abbreviated displays of large or deeply nested containers:
第11行: 第11行:
repr提供了一个 repr() 的定制版本,以显示大型或深度嵌套的容器:
repr提供了一个 repr() 的定制版本,以显示大型或深度嵌套的容器:


>>> import reprlib
>>> import reprlib
>>> reprlib.repr(set('supercalifragilisticexpialidocious'))
>>> reprlib.repr(set('supercalifragilisticexpialidocious'))
"set(['a', 'c', 'd', 'e', 'f', 'g', ...])"
"set(['a', 'c', 'd', 'e', 'f', 'g', ...])"


The pprint module offers more sophisticated control over printing both built-in and user defined objects in a way that is readable by the interpreter. When the result is longer than one line, the “pretty printer” adds line breaks and indentation to more clearly reveal data structure:
The pprint module offers more sophisticated control over printing both built-in and user defined objects in a way that is readable by the interpreter. When the result is longer than one line, the “pretty printer” adds line breaks and indentation to more clearly reveal data structure:
第19行: 第19行:
pprint模块给老手提供了一种解释器可读的方式深入控制内置和用户自定义对象的打印。当输出超过一行的时候,“美化打印(pretty printer)”添加断行和标识符,使得数据结构显示的更清晰:
pprint模块给老手提供了一种解释器可读的方式深入控制内置和用户自定义对象的打印。当输出超过一行的时候,“美化打印(pretty printer)”添加断行和标识符,使得数据结构显示的更清晰:


>>> import pprint
>>> import pprint
>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
...    'yellow'], 'blue']]]
...    'yellow'], 'blue']]]
...
...
>>> pprint.pprint(t, width=30)
>>> pprint.pprint(t, width=30)
[[[['black', 'cyan'],
[[[['black', 'cyan'],
  'white',
    'white',
  ['green', 'red']],
    ['green', 'red']],
  [['magenta', 'yellow'],
  [['magenta', 'yellow'],
  'blue']]]
    'blue']]]


The textwrap module formats paragraphs of text to fit a given screen width:
The textwrap module formats paragraphs of text to fit a given screen width:
第34行: 第34行:
textwrap模块格式化文本段落以适应设定的屏宽:
textwrap模块格式化文本段落以适应设定的屏宽:


>>> import textwrap
>>> import textwrap
>>> doc = """The wrap() method is just like fill() except that it returns
>>> doc = """The wrap() method is just like fill() except that it returns
... a list of strings instead of one big string with newlines to separate
... a list of strings instead of one big string with newlines to separate
... the wrapped lines."""
... the wrapped lines."""
...
...
>>> print(textwrap.fill(doc, width=40))
>>> print(textwrap.fill(doc, width=40))
The wrap() method is just like fill()
The wrap() method is just like fill()
except that it returns a list of strings
except that it returns a list of strings
instead of one big string with newlines
instead of one big string with newlines
to separate the wrapped lines.
to separate the wrapped lines.


The locale module accesses a database of culture specific data formats. The grouping attribute of locale’s format function provides a direct way of formatting numbers with group separators:
The locale module accesses a database of culture specific data formats. The grouping attribute of locale’s format function provides a direct way of formatting numbers with group separators:
第49行: 第49行:
locale模块按访问预定好的国家信息数据库。locale 的格式化函数属性集提供了一个直接方式以分组标示格式化数字:
locale模块按访问预定好的国家信息数据库。locale 的格式化函数属性集提供了一个直接方式以分组标示格式化数字:


>>> import locale
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
'English_United States.1252'
'English_United States.1252'
>>> conv = locale.localeconv()          # get a mapping of conventions
>>> conv = locale.localeconv()          # get a mapping of conventions
>>> x = 1234567.8
>>> x = 1234567.8
>>> locale.format("%d", x, grouping=True)
>>> locale.format("%d", x, grouping=True)
'1,234,567'
'1,234,567'
>>> locale.format_string("%s%.*f", (conv['currency_symbol'],
>>> locale.format_string("%s%.*f", (conv['currency_symbol'],
...                      conv['frac_digits'], x), grouping=True)
...                      conv['frac_digits'], x), grouping=True)
'$1,234,567.80'
'$1,234,567.80'


11.2. 模版
==-11.2. 模版 Templating==


The string module includes a versatile Template class with a simplified syntax suitable for editing by end-users. This allows users to customize their applications without having to alter the application.
The string module includes a versatile Template class with a simplified syntax suitable for editing by end-users. This allows users to customize their applications without having to alter the application.
第70行: 第70行:
格式使用 $ 为开头的 Python 合法标识(数字、字母和下划线)作为占位符。占位符外面的大括号使它可以和其它的字符不加空格混在一起。 $$ 创建一个单独的 “$”:
格式使用 $ 为开头的 Python 合法标识(数字、字母和下划线)作为占位符。占位符外面的大括号使它可以和其它的字符不加空格混在一起。 $$ 创建一个单独的 “$”:


>>> from string import Template
>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'
'Nottinghamfolk send $10 to the ditch fund.'


The substitute() method raises a KeyError when a placeholder is not supplied in a dictionary or a keyword argument. For mail-merge style applications, user supplied data may be incomplete and the safe_substitute() method may be more appropriate — it will leave placeholders unchanged if data is missing:
The substitute() method raises a KeyError when a placeholder is not supplied in a dictionary or a keyword argument. For mail-merge style applications, user supplied data may be incomplete and the safe_substitute() method may be more appropriate — it will leave placeholders unchanged if data is missing:
第79行: 第79行:
字典或者关键字参数中缺少某个占位符的时候 substitute() 方法抛出 KeyError 异常。在邮件-合并风格的应用程序中,用户提供的数据可能并不完整,也许用 safe-substitute() 方法更合适——如果数据不完整,它保留未改动的占位符:
字典或者关键字参数中缺少某个占位符的时候 substitute() 方法抛出 KeyError 异常。在邮件-合并风格的应用程序中,用户提供的数据可能并不完整,也许用 safe-substitute() 方法更合适——如果数据不完整,它保留未改动的占位符:


>>> t = Template('Return the $item to $owner.')
>>> t = Template('Return the $item to $owner.')
>>> d = dict(item='unladen swallow')
>>> d = dict(item='unladen swallow')
>>> t.substitute(d)
>>> t.substitute(d)
Traceback (most recent call last):
Traceback (most recent call last):
  . . .
  . . .
KeyError: 'owner'
KeyError: 'owner'
>>> t.safe_substitute(d)
>>> t.safe_substitute(d)
'Return the unladen swallow to $owner.'
'Return the unladen swallow to $owner.'


Template subclasses can specify a custom delimiter. For example, a batch renaming utility for a photo browser may elect to use percent signs for placeholders such as the current date, image sequence number, or file format:
Template subclasses can specify a custom delimiter. For example, a batch renaming utility for a photo browser may elect to use percent signs for placeholders such as the current date, image sequence number, or file format:
第98行: 第98行:
     >>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format):  ')
     >>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format):  ')
     Enter rename style (%d-date %n-seqnum %f-format):  Ashley_%n%f
     Enter rename style (%d-date %n-seqnum %f-format):  Ashley_%n%f
 
     >>> t = BatchRename(fmt)
     >>> t = BatchRename(fmt)
     >>> date = time.strftime('%d%b%y')
     >>> date = time.strftime('%d%b%y')
第105行: 第105行:
     ...    newname = t.substitute(d=date, n=i, f=ext)
     ...    newname = t.substitute(d=date, n=i, f=ext)
     ...    print('{0} --> {1}'.format(filename, newname))
     ...    print('{0} --> {1}'.format(filename, newname))
 
    img_1074.jpg –> Ashley_0.jpg img_1076.jpg –> Ashley_1.jpg img_1077.jpg –> Ashley_2.jpg
  img_1074.jpg –> Ashley_0.jpg
  img_1076.jpg –> Ashley_1.jpg
  img_1077.jpg –> Ashley_2.jpg


Another application for templating is separating program logic from the details of multiple output formats. This makes it possible to substitute custom templates for XML files, plain text reports, and HTML web reports.
Another application for templating is separating program logic from the details of multiple output formats. This makes it possible to substitute custom templates for XML files, plain text reports, and HTML web reports.


另一个应用是将多样化的输出格式细节从程序逻辑中分离出来。这使得为 XML 文件,纯文本报表,HTML web 报表定制替换模版成为可能。
另一个应用是将多样化的输出格式细节从程序逻辑中分离出来。这使得为 XML 文件,纯文本报表,HTML web 报表定制替换模版成为可能。
11.3. 使用二进制记录层
==-11.3. 使用二进制记录层 Working with Binary Data Record Layouts==


The struct module provides pack() and unpack() functions for working with variable length binary record formats. The following example shows how to loop through header information in a ZIP file without using the zipfile module. Pack codes "H" and "I" represent two and four byte unsigned numbers respectively. The "<" indicates that they are standard size and in little-endian byte order:
The struct module provides pack() and unpack() functions for working with variable length binary record formats. The following example shows how to loop through header information in a ZIP file without using the zipfile module. Pack codes "H" and "I" represent two and four byte unsigned numbers respectively. The "<" indicates that they are standard size and in little-endian byte order:
第117行: 第119行:
struct模块提供 :pack 和 unpack函数用于变长二进制记录格式。以下示例显示了如何通过ZIP文件的头信息(压缩代码中的 “H” 和 “L” 分别传递二和四字节无符号整数)权文博:
struct模块提供 :pack 和 unpack函数用于变长二进制记录格式。以下示例显示了如何通过ZIP文件的头信息(压缩代码中的 “H” 和 “L” 分别传递二和四字节无符号整数)权文博:


import struct
import struct
 
data = open('myfile.zip', 'rb').read()
data = open('myfile.zip', 'rb').read()
start = 0
start = 0
for i in range(3):                      # show the first 3 file headers
for i in range(3):                      # show the first 3 file headers
    start += 14
    start += 14
    fields = struct.unpack('<IIIHH', data[start:start+16])
    fields = struct.unpack('<IIIHH', data[start:start+16])
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields
 
    start += 16
    start += 16
    filename = data[start:start+filenamesize]
    filename = data[start:start+filenamesize]
    start += filenamesize
    start += filenamesize
    extra = data[start:start+extra_size]
    extra = data[start:start+extra_size]
    print(filename, hex(crc32), comp_size, uncomp_size)
    print(filename, hex(crc32), comp_size, uncomp_size)
 
    start += extra_size + comp_size    # skip to the next header
    start += extra_size + comp_size    # skip to the next header


11.4. 多线程
==-11.4. 多线程 Multi-threading==


Threading is a technique for decoupling tasks which are not sequentially dependent. Threads can be used to improve the responsiveness of applications that accept user input while other tasks run in the background. A related use case is running I/O in parallel with computations in another thread.
Threading is a technique for decoupling tasks which are not sequentially dependent. Threads can be used to improve the responsiveness of applications that accept user input while other tasks run in the background. A related use case is running I/O in parallel with computations in another thread.
第144行: 第146行:
下面的代码显示了高级模块 :mod`threading` 如何在主程序运行的同时运行任务:
下面的代码显示了高级模块 :mod`threading` 如何在主程序运行的同时运行任务:


import threading, zipfile
import threading, zipfile
 
class AsyncZip(threading.Thread):
class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)
        threading.Thread.__init__(self)
        self.infile = infile
        self.infile = infile
        self.outfile = outfile
        self.outfile = outfile
    def run(self):
    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.write(self.infile)
        f.close()
        f.close()
        print('Finished background zip of:', self.infile)
        print('Finished background zip of:', self.infile)
 
background = AsyncZip('mydata.txt', 'myarchive.zip')
background = AsyncZip('mydata.txt', 'myarchive.zip')
background.start()
background.start()
print('The main program continues to run in foreground.')
print('The main program continues to run in foreground.')
 
background.join()    # Wait for the background task to finish
background.join()    # Wait for the background task to finish
print('Main program waited until background was done.')
print('Main program waited until background was done.')


The principal challenge of multi-threaded applications is coordinating threads that share data or other resources. To that end, the threading module provides a number of synchronization primitives including locks, events, condition variables, and semaphores.
The principal challenge of multi-threaded applications is coordinating threads that share data or other resources. To that end, the threading module provides a number of synchronization primitives including locks, events, condition variables, and semaphores.
第171行: 第173行:


尽管工具很强大,但微小的设计错误也可能造成难以挽回的故障。因此,更好的方法是将所有的资源访问集中到一个独立的线程中,然后使用 Queue 模块调度该线程相应其它线程的请求。应用程序使用 Queue 对象可以让内部线程通信和协调更容易设计,更可读,更可靠。
尽管工具很强大,但微小的设计错误也可能造成难以挽回的故障。因此,更好的方法是将所有的资源访问集中到一个独立的线程中,然后使用 Queue 模块调度该线程相应其它线程的请求。应用程序使用 Queue 对象可以让内部线程通信和协调更容易设计,更可读,更可靠。
11.5. 日志
==-11.5. 日志 Logging==


The logging module offers a full featured and flexible logging system. At its simplest, log messages are sent to a file or to sys.stderr:
The logging module offers a full featured and flexible logging system. At its simplest, log messages are sent to a file or to sys.stderr:
第177行: 第179行:
logging模块提供了完整和灵活的日志系统。它最简单的用法是记录信息并发送到一个文件或 ``sys.stderr`:
logging模块提供了完整和灵活的日志系统。它最简单的用法是记录信息并发送到一个文件或 ``sys.stderr`:


import logging
import logging
logging.debug('Debugging information')
logging.debug('Debugging information')
logging.info('Informational message')
logging.info('Informational message')
logging.warning('Warning:config file %s not found', 'server.conf')
logging.warning('Warning:config file %s not found', 'server.conf')
logging.error('Error occurred')
logging.error('Error occurred')
logging.critical('Critical error -- shutting down')
logging.critical('Critical error -- shutting down')


This produces the following output:
This produces the following output:
第188行: 第190行:
这里是输出:
这里是输出:


WARNING:root:Warning:config file server.conf not found
WARNING:root:Warning:config file server.conf not found
ERROR:root:Error occurred
ERROR:root:Error occurred
CRITICAL:root:Critical error -- shutting down
CRITICAL:root:Critical error -- shutting down


By default, informational and debugging messages are suppressed and the output is sent to standard error. Other output options include routing messages through email, datagrams, sockets, or to an HTTP Server. New filters can select different routing based on message priority: DEBUG, INFO, WARNING, ERROR, and CRITICAL.
By default, informational and debugging messages are suppressed and the output is sent to standard error. Other output options include routing messages through email, datagrams, sockets, or to an HTTP Server. New filters can select different routing based on message priority: DEBUG, INFO, WARNING, ERROR, and CRITICAL.
第199行: 第201行:


日志系统可以直接在 Python 中定制,也可以不经过应用程序直接在一个用户可编辑的配置文件中加载。
日志系统可以直接在 Python 中定制,也可以不经过应用程序直接在一个用户可编辑的配置文件中加载。
11.6. 弱引用
==-11.6. 弱引用 Weak References==


Python does automatic memory management (reference counting for most objects and garbage collection to eliminate cycles). The memory is freed shortly after the last reference to it has been eliminated.
Python does automatic memory management (reference counting for most objects and garbage collection to eliminate cycles). The memory is freed shortly after the last reference to it has been eliminated.
第209行: 第211行:
这个工作方式对大多数应用程序工作良好,但是偶尔会需要跟踪对象来做一些事。不幸的是,仅仅为跟踪它们创建引用也会使其长期存在。 weakref 模块提供了不用创建引用的跟踪对象工具,一旦对象不再存在,它自动从弱引用表上删除并触发回调。典型的应用包括捕获难以构造的对象:
这个工作方式对大多数应用程序工作良好,但是偶尔会需要跟踪对象来做一些事。不幸的是,仅仅为跟踪它们创建引用也会使其长期存在。 weakref 模块提供了不用创建引用的跟踪对象工具,一旦对象不再存在,它自动从弱引用表上删除并触发回调。典型的应用包括捕获难以构造的对象:


>>> import weakref, gc
>>> import weakref, gc
>>> class A:
>>> class A:
...    def __init__(self, value):
...    def __init__(self, value):
...            self.value = value
...            self.value = value
...    def __repr__(self):
...    def __repr__(self):
...            return str(self.value)
...            return str(self.value)
...
...
>>> a = A(10)                  # create a reference
>>> a = A(10)                  # create a reference
>>> d = weakref.WeakValueDictionary()
>>> d = weakref.WeakValueDictionary()
>>> d['primary'] = a            # does not create a reference
>>> d['primary'] = a            # does not create a reference
>>> d['primary']                # fetch the object if it is still alive
>>> d['primary']                # fetch the object if it is still alive
10
10
>>> del a                      # remove the one reference
>>> del a                      # remove the one reference
>>> gc.collect()                # run garbage collection right away
>>> gc.collect()                # run garbage collection right away
0
0
>>> d['primary']                # entry was automatically removed
>>> d['primary']                # entry was automatically removed
Traceback (most recent call last):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <module>
    d['primary']                # entry was automatically removed
    d['primary']                # entry was automatically removed
  File "C:/python31/lib/weakref.py", line 46, in __getitem__
  File "C:/python31/lib/weakref.py", line 46, in __getitem__
    o = self.data[key]()
    o = self.data[key]()
KeyError: 'primary'
KeyError: 'primary'


11.7. 链表工具
==-11.7. 链表工具 Tools for Working with Lists==


Many data structure needs can be met with the built-in list type. However, sometimes there is a need for alternative implementations with different performance trade-offs.
Many data structure needs can be met with the built-in list type. However, sometimes there is a need for alternative implementations with different performance trade-offs.
第242行: 第244行:
array模块提供了一个类似链表的array()对象,它仅仅是存储数据,更为紧凑。以下的示例演示了一个存储双字节无符号整数的数组(类型编码 ``“H”``)而非存储16字节 Python 整数对象的普通正规链表:
array模块提供了一个类似链表的array()对象,它仅仅是存储数据,更为紧凑。以下的示例演示了一个存储双字节无符号整数的数组(类型编码 ``“H”``)而非存储16字节 Python 整数对象的普通正规链表:


>>> from array import array
>>> from array import array
>>> a = array('H', [4000, 10, 700, 22222])
>>> a = array('H', [4000, 10, 700, 22222])
>>> sum(a)
>>> sum(a)
26932
26932
>>> a[1:3]
>>> a[1:3]
array('H', [10, 700])
array('H', [10, 700])


The collections module provides a deque() object that is like a list with faster appends and pops from the left side but slower lookups in the middle. These objects are well suited for implementing queues and breadth first tree searches:
The collections module provides a deque() object that is like a list with faster appends and pops from the left side but slower lookups in the middle. These objects are well suited for implementing queues and breadth first tree searches:
第253行: 第255行:
collections模块提供了类似链表的`deque() 对象,它从左边添加(append)和弹出(pop)更快,但是在内部查询更慢。这些对象更适用于队列实现和广度优先的树搜索:
collections模块提供了类似链表的`deque() 对象,它从左边添加(append)和弹出(pop)更快,但是在内部查询更慢。这些对象更适用于队列实现和广度优先的树搜索:


>>> from collections import deque
>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> d.append("task4")
>>> print("Handling", d.popleft())
>>> print("Handling", d.popleft())
Handling task1
Handling task1
 
unsearched = deque([starting_node])
unsearched = deque([starting_node])
def breadth_first_search(unsearched):
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    node = unsearched.popleft()
    for m in gen_moves(node):
    for m in gen_moves(node):
        if is_goal(m):
        if is_goal(m):
            return m
            return m
        unsearched.append(m)
        unsearched.append(m)


In addition to alternative list implementations, the library also offers other tools such as the bisect module with functions for manipulating sorted lists:
In addition to alternative list implementations, the library also offers other tools such as the bisect module with functions for manipulating sorted lists:
第271行: 第273行:
除了链表的替代实现,该库还提供了 bisect 这样的模块以操作存储链表:
除了链表的替代实现,该库还提供了 bisect 这样的模块以操作存储链表:


>>> import bisect
>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]


The heapq module provides functions for implementing heaps based on regular lists. The lowest valued entry is always kept at position zero. This is useful for applications which repeatedly access the smallest element but do not want to run a full list sort:
The heapq module provides functions for implementing heaps based on regular lists. The lowest valued entry is always kept at position zero. This is useful for applications which repeatedly access the smallest element but do not want to run a full list sort:
第281行: 第283行:
heapq提供了基于正规链表的堆实现。最小的值总是保持在0点。这在希望循环访问最小元素但是不想执行完整堆排序的时候非常有用:
heapq提供了基于正规链表的堆实现。最小的值总是保持在0点。这在希望循环访问最小元素但是不想执行完整堆排序的时候非常有用:


>>> from heapq import heapify, heappop, heappush
>>> from heapq import heapify, heappop, heappush
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> heapify(data)                      # rearrange the list into heap order
>>> heapify(data)                      # rearrange the list into heap order
>>> heappush(data, -5)                # add a new entry
>>> heappush(data, -5)                # add a new entry
>>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
>>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
[-5, 0, 1]
[-5, 0, 1]


11.8. 十进制浮点数算法
==-11.8. 十进制浮点数算法 Decimal Floating Point Arithmetic==


The decimal module offers a Decimal datatype for decimal floating point arithmetic. Compared to the built-in float implementation of binary floating point, the class is especially helpful for
The decimal module offers a Decimal datatype for decimal floating point arithmetic. Compared to the built-in float implementation of binary floating point, the class is especially helpful for


    financial applications and other uses which require exact decimal representation,
*financial applications and other uses which require exact decimal representation,
*control over precision,
*control over rounding to meet legal or regulatory requirements,
*tracking of significant decimal places, or
*applications where the user expects the results to match calculations done by hand.


    control over precision,
decimal 模块提供了一个 Decimal 数据类型用于浮点数计算。相比内置的二进制浮点数实现 `float`,新类型特别适用于:
 
*金融应用和其它需要精确十进制表达的场合,
    control over rounding to meet legal or regulatory requirements,
*控制精度,
 
*控制舍入以适应法律或者规定要求,
    tracking of significant decimal places, or
*确保十进制数位精度,
 
*或者用户希望用作数学计算的场合。
    applications where the user expects the results to match calculations done by hand.
 
    decimal 模块提供了一个 Decimal 数据类型用于浮点数计算。相比内置的二进制浮点数实现 `float`,新类型特别适用于:
 
    金融应用和其它需要精确十进制表达的场合,
 
    控制精度,
 
    控制舍入以适应法律或者规定要求,
 
    确保十进制数位精度,
 
    或者用户希望用作数学计算的场合。


For example, calculating a 5% tax on a 70 cent phone charge gives different results in decimal floating point and binary floating point. The difference becomes significant if the results are rounded to the nearest cent:
For example, calculating a 5% tax on a 70 cent phone charge gives different results in decimal floating point and binary floating point. The difference becomes significant if the results are rounded to the nearest cent:
第318行: 第311行:
例如,计算 70 分电话费的 5% 税计算,十进制浮点数和二进制浮点数计算结果的差别如下。如果在分值上舍入,这个差别就很重要了:
例如,计算 70 分电话费的 5% 税计算,十进制浮点数和二进制浮点数计算结果的差别如下。如果在分值上舍入,这个差别就很重要了:


>>> from decimal import *
>>> from decimal import *
>>> round(Decimal('0.70') * Decimal('1.05'), 2)
>>> round(Decimal('0.70') * Decimal('1.05'), 2)
Decimal('0.74')
Decimal('0.74')
>>> round(.70 * 1.05, 2)
>>> round(.70 * 1.05, 2)
0.73
0.73


The Decimal result keeps a trailing zero, automatically inferring four place significance from multiplicands with two place significance. Decimal reproduces mathematics as done by hand and avoids issues that can arise when binary floating point cannot exactly represent decimal quantities.
The Decimal result keeps a trailing zero, automatically inferring four place significance from multiplicands with two place significance. Decimal reproduces mathematics as done by hand and avoids issues that can arise when binary floating point cannot exactly represent decimal quantities.
第332行: 第325行:
高精度使 Decimal 可以执行二进制浮点数无法进行的模运算和等值测试:
高精度使 Decimal 可以执行二进制浮点数无法进行的模运算和等值测试:


>>> Decimal('1.00') % Decimal('.10')
>>> Decimal('1.00') % Decimal('.10')
Decimal('0.00')
Decimal('0.00')
>>> 1.00 % 0.10
>>> 1.00 % 0.10
0.09999999999999995
0.09999999999999995
 
>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
True
True
>>> sum([0.1]*10) == 1.0
>>> sum([0.1]*10) == 1.0
False
False


The decimal module provides arithmetic with as much precision as needed:
The decimal module provides arithmetic with as much precision as needed:
第346行: 第339行:
decimal提供了高精度算法:
decimal提供了高精度算法:


>>> getcontext().prec = 36
>>> getcontext().prec = 36
>>> Decimal(1) / Decimal(7)
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857142857')
Decimal('0.142857142857142857142857142857142857')





2012年5月30日 (三) 07:52的最新版本

-11. 标准库概要-第二部分 Brief Tour of the Standard Library – Part II(已翻译,尚未校对)

感谢“中译社”翻译本页(尚未校对),可以到readthedocs.org查看更新更专业的翻译版本

This second tour covers more advanced modules that support professional programming needs. These modules rarely occur in small scripts.

第二部分包含了支持专业编程工作所需的更高级的模块,这些模块很少出现在小脚本中。

-11.1. 输出格式 Output Formatting

The reprlib module provides a version of repr() customized for abbreviated displays of large or deeply nested containers:

repr提供了一个 repr() 的定制版本,以显示大型或深度嵌套的容器:

>>> import reprlib
>>> reprlib.repr(set('supercalifragilisticexpialidocious'))
"set(['a', 'c', 'd', 'e', 'f', 'g', ...])"

The pprint module offers more sophisticated control over printing both built-in and user defined objects in a way that is readable by the interpreter. When the result is longer than one line, the “pretty printer” adds line breaks and indentation to more clearly reveal data structure:

pprint模块给老手提供了一种解释器可读的方式深入控制内置和用户自定义对象的打印。当输出超过一行的时候,“美化打印(pretty printer)”添加断行和标识符,使得数据结构显示的更清晰:

>>> import pprint
>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
...     'yellow'], 'blue']]]
...
>>> pprint.pprint(t, width=30)
[[[['black', 'cyan'],
   'white',
   ['green', 'red']],
  [['magenta', 'yellow'],
   'blue']]]

The textwrap module formats paragraphs of text to fit a given screen width:

textwrap模块格式化文本段落以适应设定的屏宽:

>>> import textwrap
>>> doc = """The wrap() method is just like fill() except that it returns
... a list of strings instead of one big string with newlines to separate
... the wrapped lines."""
...
>>> print(textwrap.fill(doc, width=40))
The wrap() method is just like fill()
except that it returns a list of strings
instead of one big string with newlines
to separate the wrapped lines.

The locale module accesses a database of culture specific data formats. The grouping attribute of locale’s format function provides a direct way of formatting numbers with group separators:

locale模块按访问预定好的国家信息数据库。locale 的格式化函数属性集提供了一个直接方式以分组标示格式化数字:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
'English_United States.1252'
>>> conv = locale.localeconv()          # get a mapping of conventions
>>> x = 1234567.8
>>> locale.format("%d", x, grouping=True)
'1,234,567'
>>> locale.format_string("%s%.*f", (conv['currency_symbol'],
...                      conv['frac_digits'], x), grouping=True)
'$1,234,567.80'

-11.2. 模版 Templating

The string module includes a versatile Template class with a simplified syntax suitable for editing by end-users. This allows users to customize their applications without having to alter the application.

string 提供了一个灵活多变的模版类 :template,使用它最终用户可以用简单的进行编辑。这使用户可以在不进行改变的情况下定制他们的应用程序。

The format uses placeholder names formed by $ with valid Python identifiers (alphanumeric characters and underscores). Surrounding the placeholder with braces allows it to be followed by more alphanumeric letters with no intervening spaces. Writing $$ creates a single escaped $:

格式使用 $ 为开头的 Python 合法标识(数字、字母和下划线)作为占位符。占位符外面的大括号使它可以和其它的字符不加空格混在一起。 $$ 创建一个单独的 “$”:

>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'

The substitute() method raises a KeyError when a placeholder is not supplied in a dictionary or a keyword argument. For mail-merge style applications, user supplied data may be incomplete and the safe_substitute() method may be more appropriate — it will leave placeholders unchanged if data is missing:

字典或者关键字参数中缺少某个占位符的时候 substitute() 方法抛出 KeyError 异常。在邮件-合并风格的应用程序中,用户提供的数据可能并不完整,也许用 safe-substitute() 方法更合适——如果数据不完整,它保留未改动的占位符:

>>> t = Template('Return the $item to $owner.')
>>> d = dict(item='unladen swallow')
>>> t.substitute(d)
Traceback (most recent call last):
  . . .
KeyError: 'owner'
>>> t.safe_substitute(d)
'Return the unladen swallow to $owner.'

Template subclasses can specify a custom delimiter. For example, a batch renaming utility for a photo browser may elect to use percent signs for placeholders such as the current date, image sequence number, or file format:

模版子类可以指定一个定制分隔符。例如,图像浏览器的批量命名工具可能选用百分号作为表示当前日期、图像序列号或文件格式的占位符:

   >>> import time, os.path
   >>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
   >>> class BatchRename(Template):
   ...     delimiter = '%'
   >>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format):  ')
   Enter rename style (%d-date %n-seqnum %f-format):  Ashley_%n%f

   >>> t = BatchRename(fmt)
   >>> date = time.strftime('%d%b%y')
   >>> for i, filename in enumerate(photofiles):
   ...     base, ext = os.path.splitext(filename)
   ...     newname = t.substitute(d=date, n=i, f=ext)
   ...     print('{0} --> {1}'.format(filename, newname))

 img_1074.jpg –> Ashley_0.jpg
 img_1076.jpg –> Ashley_1.jpg
 img_1077.jpg –> Ashley_2.jpg

Another application for templating is separating program logic from the details of multiple output formats. This makes it possible to substitute custom templates for XML files, plain text reports, and HTML web reports.

另一个应用是将多样化的输出格式细节从程序逻辑中分离出来。这使得为 XML 文件,纯文本报表,HTML web 报表定制替换模版成为可能。

-11.3. 使用二进制记录层 Working with Binary Data Record Layouts

The struct module provides pack() and unpack() functions for working with variable length binary record formats. The following example shows how to loop through header information in a ZIP file without using the zipfile module. Pack codes "H" and "I" represent two and four byte unsigned numbers respectively. The "<" indicates that they are standard size and in little-endian byte order:

struct模块提供 :pack 和 unpack函数用于变长二进制记录格式。以下示例显示了如何通过ZIP文件的头信息(压缩代码中的 “H” 和 “L” 分别传递二和四字节无符号整数)权文博:

import struct

data = open('myfile.zip', 'rb').read()
start = 0
for i in range(3):                      # show the first 3 file headers
    start += 14
    fields = struct.unpack('<IIIHH', data[start:start+16])
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

    start += 16
    filename = data[start:start+filenamesize]
    start += filenamesize
    extra = data[start:start+extra_size]
    print(filename, hex(crc32), comp_size, uncomp_size)

    start += extra_size + comp_size     # skip to the next header

-11.4. 多线程 Multi-threading

Threading is a technique for decoupling tasks which are not sequentially dependent. Threads can be used to improve the responsiveness of applications that accept user input while other tasks run in the background. A related use case is running I/O in parallel with computations in another thread.

线程是一个分离无顺序依赖关系任务的技术。在某些任务运行于后台的时候应用程序会变得迟缓,线程可以提升其速度。一个相关的应用是在I/O的同时其它线程可以并行计算。

The following code shows how the high level threading module can run tasks in background while the main program continues to run:

下面的代码显示了高级模块 :mod`threading` 如何在主程序运行的同时运行任务:

import threading, zipfile

class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)
        self.infile = infile
        self.outfile = outfile
    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.close()
        print('Finished background zip of:', self.infile)

background = AsyncZip('mydata.txt', 'myarchive.zip')
background.start()
print('The main program continues to run in foreground.')

background.join()    # Wait for the background task to finish
print('Main program waited until background was done.')

The principal challenge of multi-threaded applications is coordinating threads that share data or other resources. To that end, the threading module provides a number of synchronization primitives including locks, events, condition variables, and semaphores.

多线程应用程序最重要的挑战是在协调线程共享的数据和其它资源。最终,线程模块提供了几个基本的同步方式如锁、事件,条件变量和信号旗。

While those tools are powerful, minor design errors can result in problems that are difficult to reproduce. So, the preferred approach to task coordination is to concentrate all access to a resource in a single thread and then use the queue module to feed that thread with requests from other threads. Applications using Queue objects for inter-thread communication and coordination are easier to design, more readable, and more reliable.

尽管工具很强大,但微小的设计错误也可能造成难以挽回的故障。因此,更好的方法是将所有的资源访问集中到一个独立的线程中,然后使用 Queue 模块调度该线程相应其它线程的请求。应用程序使用 Queue 对象可以让内部线程通信和协调更容易设计,更可读,更可靠。

-11.5. 日志 Logging

The logging module offers a full featured and flexible logging system. At its simplest, log messages are sent to a file or to sys.stderr:

logging模块提供了完整和灵活的日志系统。它最简单的用法是记录信息并发送到一个文件或 ``sys.stderr`:

import logging
logging.debug('Debugging information')
logging.info('Informational message')
logging.warning('Warning:config file %s not found', 'server.conf')
logging.error('Error occurred')
logging.critical('Critical error -- shutting down')

This produces the following output:

这里是输出:

WARNING:root:Warning:config file server.conf not found
ERROR:root:Error occurred
CRITICAL:root:Critical error -- shutting down

By default, informational and debugging messages are suppressed and the output is sent to standard error. Other output options include routing messages through email, datagrams, sockets, or to an HTTP Server. New filters can select different routing based on message priority: DEBUG, INFO, WARNING, ERROR, and CRITICAL.

默认情况下捕获信息和调试消息并将输出发送到标准错误流。其它可选的路由信息方式通过email,数据报文,socket或者HTTP Server。基于消息属性,新的过滤器可以选择不同的路由:DEBUG, INFO,WARNING,ERROR 和CRITICAL。

The logging system can be configured directly from Python or can be loaded from a user editable configuration file for customized logging without altering the application.

日志系统可以直接在 Python 中定制,也可以不经过应用程序直接在一个用户可编辑的配置文件中加载。

-11.6. 弱引用 Weak References

Python does automatic memory management (reference counting for most objects and garbage collection to eliminate cycles). The memory is freed shortly after the last reference to it has been eliminated.

Python 自动进行内存管理(对大多数的对象进行引用计数和:term:`垃圾回收`以循环利用)在最后一个引用消失后,内存会很快释放。

This approach works fine for most applications but occasionally there is a need to track objects only as long as they are being used by something else. Unfortunately, just tracking them creates a reference that makes them permanent. The weakref module provides tools for tracking objects without creating a reference. When the object is no longer needed, it is automatically removed from a weakref table and a callback is triggered for weakref objects. Typical applications include caching objects that are expensive to create:

这个工作方式对大多数应用程序工作良好,但是偶尔会需要跟踪对象来做一些事。不幸的是,仅仅为跟踪它们创建引用也会使其长期存在。 weakref 模块提供了不用创建引用的跟踪对象工具,一旦对象不再存在,它自动从弱引用表上删除并触发回调。典型的应用包括捕获难以构造的对象:

>>> import weakref, gc
>>> class A:
...     def __init__(self, value):
...             self.value = value
...     def __repr__(self):
...             return str(self.value)
...
>>> a = A(10)                   # create a reference
>>> d = weakref.WeakValueDictionary()
>>> d['primary'] = a            # does not create a reference
>>> d['primary']                # fetch the object if it is still alive
10
>>> del a                       # remove the one reference
>>> gc.collect()                # run garbage collection right away
0
>>> d['primary']                # entry was automatically removed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    d['primary']                # entry was automatically removed
  File "C:/python31/lib/weakref.py", line 46, in __getitem__
    o = self.data[key]()
KeyError: 'primary'

-11.7. 链表工具 Tools for Working with Lists

Many data structure needs can be met with the built-in list type. However, sometimes there is a need for alternative implementations with different performance trade-offs.

很多数据结构可能会用到内置链表类型。然而,有时可能需要不同性能代价的实现。

The array module provides an array() object that is like a list that stores only homogeneous data and stores it more compactly. The following example shows an array of numbers stored as two byte unsigned binary numbers (typecode "H") rather than the usual 16 bytes per entry for regular lists of Python int objects:

array模块提供了一个类似链表的array()对象,它仅仅是存储数据,更为紧凑。以下的示例演示了一个存储双字节无符号整数的数组(类型编码 ``“H”``)而非存储16字节 Python 整数对象的普通正规链表:

>>> from array import array
>>> a = array('H', [4000, 10, 700, 22222])
>>> sum(a)
26932
>>> a[1:3]
array('H', [10, 700])

The collections module provides a deque() object that is like a list with faster appends and pops from the left side but slower lookups in the middle. These objects are well suited for implementing queues and breadth first tree searches:

collections模块提供了类似链表的`deque() 对象,它从左边添加(append)和弹出(pop)更快,但是在内部查询更慢。这些对象更适用于队列实现和广度优先的树搜索:

>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> print("Handling", d.popleft())
Handling task1

unsearched = deque([starting_node])
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    for m in gen_moves(node):
        if is_goal(m):
            return m
        unsearched.append(m)

In addition to alternative list implementations, the library also offers other tools such as the bisect module with functions for manipulating sorted lists:

除了链表的替代实现,该库还提供了 bisect 这样的模块以操作存储链表:

>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]

The heapq module provides functions for implementing heaps based on regular lists. The lowest valued entry is always kept at position zero. This is useful for applications which repeatedly access the smallest element but do not want to run a full list sort:

heapq提供了基于正规链表的堆实现。最小的值总是保持在0点。这在希望循环访问最小元素但是不想执行完整堆排序的时候非常有用:

>>> from heapq import heapify, heappop, heappush
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> heapify(data)                      # rearrange the list into heap order
>>> heappush(data, -5)                 # add a new entry
>>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
[-5, 0, 1]

-11.8. 十进制浮点数算法 Decimal Floating Point Arithmetic

The decimal module offers a Decimal datatype for decimal floating point arithmetic. Compared to the built-in float implementation of binary floating point, the class is especially helpful for

  • financial applications and other uses which require exact decimal representation,
  • control over precision,
  • control over rounding to meet legal or regulatory requirements,
  • tracking of significant decimal places, or
  • applications where the user expects the results to match calculations done by hand.

decimal 模块提供了一个 Decimal 数据类型用于浮点数计算。相比内置的二进制浮点数实现 `float`,新类型特别适用于:

  • 金融应用和其它需要精确十进制表达的场合,
  • 控制精度,
  • 控制舍入以适应法律或者规定要求,
  • 确保十进制数位精度,
  • 或者用户希望用作数学计算的场合。

For example, calculating a 5% tax on a 70 cent phone charge gives different results in decimal floating point and binary floating point. The difference becomes significant if the results are rounded to the nearest cent:

例如,计算 70 分电话费的 5% 税计算,十进制浮点数和二进制浮点数计算结果的差别如下。如果在分值上舍入,这个差别就很重要了:

>>> from decimal import *
>>> round(Decimal('0.70') * Decimal('1.05'), 2)
Decimal('0.74')
>>> round(.70 * 1.05, 2)
0.73

The Decimal result keeps a trailing zero, automatically inferring four place significance from multiplicands with two place significance. Decimal reproduces mathematics as done by hand and avoids issues that can arise when binary floating point cannot exactly represent decimal quantities.

Decimal 的结果总是保有结尾的0,自动从两位精度延伸到4位。Decimal重现了手工的数学运算,这就确保了二进制浮点数无法精确保有的数据精度。

Exact representation enables the Decimal class to perform modulo calculations and equality tests that are unsuitable for binary floating point:

高精度使 Decimal 可以执行二进制浮点数无法进行的模运算和等值测试:

>>> Decimal('1.00') % Decimal('.10')
Decimal('0.00')
>>> 1.00 % 0.10
0.09999999999999995

>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
True
>>> sum([0.1]*10) == 1.0
False

The decimal module provides arithmetic with as much precision as needed:

decimal提供了高精度算法:

>>> getcontext().prec = 36
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857142857')


————— 返回《 Python 官方教程 》目录 —————
—— 返回《 Python 手册 》总目录 ——