
16.2. locale — Cultural Localization API

Purpose: Format and parse values that depend on the user's language and location.

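The locale module is part of Python's internationalization and localization support library. It provides a standard way to handle operations that depend on the user's language or location, such as formatting numbers as currency, comparing strings for sorting, and working with dates and times. It does not cover translation (see the gettext module) or Unicode encoding (see the codecs module).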

Note

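Changing the locale has application-wide effects, so the recommended practice is to avoid changing it in a library and to let the application set it once. The examples in this section change the locale several times within a short program to highlight how different settings affect the output. It is far more common for an application to set the locale once at startup, or when a web request is received, and not change it afterward.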
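When code has no choice but to format a value under a specific locale, one way to limit the side effect is to save the current setting and restore it afterward. The sketch below assumes a hypothetical helper named temporary_locale(); it is not part of the locale module, and because setlocale() affects the whole process it is still not safe to use from multiple threads.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(name, category=locale.LC_ALL):
    """Hypothetical helper: switch the locale inside a with block."""
    # Calling setlocale() with only a category returns the current setting.
    saved = locale.setlocale(category)
    try:
        yield locale.setlocale(category, name)
    finally:
        # Restore the previous setting even if the body raised.
        locale.setlocale(category, saved)


# The 'C' locale is always available, so it is used here for illustration.
with temporary_locale('C', locale.LC_NUMERIC):
    print(locale.localeconv()['decimal_point'])
```

The helper only narrows the window in which the changed locale is visible; setting the locale once in the application remains the simpler approach.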

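This section covers some of the high-level functions in locale. The module also provides lower-level functions, such as format_string() for formatting strings, and functions for managing an application's locale, such as resetlocale().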

Probing the Current Locale

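The most common way to let users change the locale settings for an application is through an environment variable. The variable name varies by platform; common ones are LC_ALL, LC_CTYPE, LANG, and LANGUAGE.
The application then calls setlocale() with an empty string instead of a hard-coded locale name, and the settings are taken from the environment.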
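The environment may name a locale that is not installed on the system, in which case setlocale() raises locale.Error. A minimal sketch of a startup routine that tolerates this follows; the function name init_locale() and the choice of 'C' as the fallback are assumptions made for illustration.

```python
import locale


def init_locale():
    """Apply the user's environment settings, falling back to 'C'."""
    try:
        # An empty string asks for the locale named by the environment.
        return locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        # Raised when the environment names a locale that is unavailable.
        return locale.setlocale(locale.LC_ALL, 'C')


print('Active locale:', init_locale())
```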

locale_env.py

import locale
import os
import pprint

# Default settings based on the user's environment.
locale.setlocale(locale.LC_ALL, '')

print('Environment settings:')
for env_name in ['LC_ALL', 'LC_CTYPE', 'LANG', 'LANGUAGE']:
    print('  {} = {}'.format(
        env_name, os.environ.get(env_name, ''))
    )

# What is the locale?
print('\nLocale from environment:', locale.getlocale())

template = """
Numeric formatting:

  Decimal point      : "{decimal_point}"
  Grouping positions : {grouping}
  Thousands separator: "{thousands_sep}"

Monetary formatting:

  International currency symbol   : "{int_curr_symbol!r}"
  Local currency symbol           : {currency_symbol!r}
  Symbol precedes positive value  : {p_cs_precedes}
  Symbol precedes negative value  : {n_cs_precedes}
  Decimal point                   : "{mon_decimal_point}"
  Digits in fractional values     : {frac_digits}
  Digits in fractional values,
                   international  : {int_frac_digits}
  Grouping positions              : {mon_grouping}
  Thousands separator             : "{mon_thousands_sep}"
  Positive sign                   : "{positive_sign}"
  Positive sign position          : {p_sign_posn}
  Negative sign                   : "{negative_sign}"
  Negative sign position          : {n_sign_posn}

"""

sign_positions = {
    0: 'Surrounded by parentheses',
    1: 'Before value and symbol',
    2: 'After value and symbol',
    3: 'Before value',
    4: 'After value',
    locale.CHAR_MAX: 'Unspecified',
}

info = {}
info.update(locale.localeconv())
info['p_sign_posn'] = sign_positions[info['p_sign_posn']]
info['n_sign_posn'] = sign_positions[info['n_sign_posn']]

print(template.format(**info))

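The localeconv() function returns a dictionary containing the conventions for the locale. The full list of names and their definitions is covered in the standard library documentation.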
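To make the role of a few of these fields concrete, the sketch below shows how a currency symbol could be attached to an already formatted positive amount using the currency_symbol, p_cs_precedes, and p_sep_by_space keys. The function place_symbol() and the two dictionaries are hypothetical and written by hand rather than read from a real locale, and the sketch treats p_sep_by_space as a simple yes/no flag, which is a simplification.

```python
def place_symbol(amount_text, conv):
    """Attach a currency symbol to an already formatted positive amount."""
    symbol = conv['currency_symbol']
    # Simplification: any non-zero value means "separate with one space".
    space = ' ' if conv['p_sep_by_space'] else ''
    if conv['p_cs_precedes']:
        return symbol + space + amount_text
    return amount_text + space + symbol


# Hand-written dictionaries for illustration, not read from a real locale.
dollar_style = {
    'currency_symbol': '$', 'p_cs_precedes': 1, 'p_sep_by_space': 0,
}
euro_style = {
    'currency_symbol': 'Eu', 'p_cs_precedes': 0, 'p_sep_by_space': 1,
}

print(place_symbol('1234.56', dollar_style))
print(place_symbol('1234,56', euro_style))
```

In practice locale.currency() performs this assembly, including sign placement, so application code rarely needs to read these fields directly.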

On a Mac running OS X 10.11.6 with the locale variables cleared, the program produces this output:

$ export LANG=; export LC_CTYPE=; python3 locale_env.py

Environment settings:
  LC_ALL =
  LC_CTYPE =
  LANG =
  LANGUAGE =

Locale from environment: (None, None)

Numeric formatting:

  Decimal point      : "."
  Grouping positions : []
  Thousands separator: ""

Monetary formatting:

  International currency symbol   : "''"
  Local currency symbol           : ''
  Symbol precedes positive value  : 127
  Symbol precedes negative value  : 127
  Decimal point                   : ""
  Digits in fractional values     : 127
  Digits in fractional values,
                   international  : 127
  Grouping positions              : []
  Thousands separator             : ""
  Positive sign                   : ""
  Positive sign position          : Unspecified
  Negative sign                   : ""
  Negative sign position          : Unspecified

Running the same script with LANG, LC_CTYPE, and LC_ALL set to different values shows how the locale and default encoding change.

United States (en_US):

$ LANG=en_US LC_CTYPE=en_US LC_ALL=en_US python3 locale_env.py

Environment settings:
  LC_ALL = en_US
  LC_CTYPE = en_US
  LANG = en_US
  LANGUAGE =

Locale from environment: ('en_US', 'ISO8859-1')

Numeric formatting:

  Decimal point      : "."
  Grouping positions : [3, 3, 0]
  Thousands separator: ","

Monetary formatting:

  International currency symbol   : "'USD '"
  Local currency symbol           : '$'
  Symbol precedes positive value  : 1
  Symbol precedes negative value  : 1
  Decimal point                   : "."
  Digits in fractional values     : 2
  Digits in fractional values,
                   international  : 2
  Grouping positions              : [3, 3, 0]
  Thousands separator             : ","
  Positive sign                   : ""
  Positive sign position          : Before value and symbol
  Negative sign                   : "-"
  Negative sign position          : Before value and symbol

France (fr_FR):

$ LANG=fr_FR LC_CTYPE=fr_FR LC_ALL=fr_FR python3 locale_env.py

Environment settings:
  LC_ALL = fr_FR
  LC_CTYPE = fr_FR
  LANG = fr_FR
  LANGUAGE =

Locale from environment: ('fr_FR', 'ISO8859-1')

Numeric formatting:

  Decimal point      : ","
  Grouping positions : [127]
  Thousands separator: ""

Monetary formatting:

  International currency symbol   : "'EUR '"
  Local currency symbol           : 'Eu'
  Symbol precedes positive value  : 0
  Symbol precedes negative value  : 0
  Decimal point                   : ","
  Digits in fractional values     : 2
  Digits in fractional values,
                   international  : 2
  Grouping positions              : [3, 3, 0]
  Thousands separator             : " "
  Positive sign                   : ""
  Positive sign position          : Before value and symbol
  Negative sign                   : "-"
  Negative sign position          : After value and symbol

Spain (es_ES):

$ LANG=es_ES LC_CTYPE=es_ES LC_ALL=es_ES python3 locale_env.py

Environment settings:
  LC_ALL = es_ES
  LC_CTYPE = es_ES
  LANG = es_ES
  LANGUAGE =

Locale from environment: ('es_ES', 'ISO8859-1')

Numeric formatting:

  Decimal point      : ","
  Grouping positions : [127]
  Thousands separator: ""

Monetary formatting:

  International currency symbol   : "'EUR '"
  Local currency symbol           : 'Eu'
  Symbol precedes positive value  : 0
  Symbol precedes negative value  : 0
  Decimal point                   : ","
  Digits in fractional values     : 2
  Digits in fractional values,
                   international  : 2
  Grouping positions              : [3, 3, 0]
  Thousands separator             : "."
  Positive sign                   : ""
  Positive sign position          : Before value and symbol
  Negative sign                   : "-"
  Negative sign position          : Before value and symbol

Portugal (pt_PT):

$ LANG=pt_PT LC_CTYPE=pt_PT LC_ALL=pt_PT python3 locale_env.py

Environment settings:
  LC_ALL = pt_PT
  LC_CTYPE = pt_PT
  LANG = pt_PT
  LANGUAGE =

Locale from environment: ('pt_PT', 'ISO8859-1')

Numeric formatting:

  Decimal point      : ","
  Grouping positions : []
  Thousands separator: " "

Monetary formatting:

  International currency symbol   : "'EUR '"
  Local currency symbol           : 'Eu'
  Symbol precedes positive value  : 0
  Symbol precedes negative value  : 0
  Decimal point                   : "."
  Digits in fractional values     : 2
  Digits in fractional values,
                   international  : 2
  Grouping positions              : [3, 3, 0]
  Thousands separator             : "."
  Positive sign                   : ""
  Positive sign position          : Before value and symbol
  Negative sign                   : "-"
  Negative sign position          : Before value and symbol

Poland (pl_PL):

$ LANG=pl_PL LC_CTYPE=pl_PL LC_ALL=pl_PL python3 locale_env.py

Environment settings:
  LC_ALL = pl_PL
  LC_CTYPE = pl_PL
  LANG = pl_PL
  LANGUAGE =

Locale from environment: ('pl_PL', 'ISO8859-2')

Numeric formatting:

  Decimal point      : ","
  Grouping positions : [3, 3, 0]
  Thousands separator: " "

Monetary formatting:

  International currency symbol   : "'PLN '"
  Local currency symbol           : 'zł'
  Symbol precedes positive value  : 1
  Symbol precedes negative value  : 1
  Decimal point                   : ","
  Digits in fractional values     : 2
  Digits in fractional values,
                   international  : 2
  Grouping positions              : [3, 3, 0]
  Thousands separator             : " "
  Positive sign                   : ""
  Positive sign position          : After value
  Negative sign                   : "-"
  Negative sign position          : After value

Currency

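The earlier example output shows that changing the locale also changes the currency symbol and the separator characters used in numbers.
This example loops through several locales and prints a positive and a negative currency value under each one so the differences can be compared.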

locale_currency.py

import locale

sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]

for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    print('{:>10}: {:>10}  {:>10}'.format(
        name,
        locale.currency(1234.56),
        locale.currency(-1234.56),
    ))

The output is this small table:

$ python3 locale_currency.py

       USA:   $1234.56   -$1234.56
    France: 1234,56 Eu  1234,56 Eu-
     Spain: 1234,56 Eu  -1234,56 Eu
  Portugal: 1234.56 Eu  -1234.56 Eu
    Poland: zł 1234,56  zł 1234,56-
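The values in the table are not grouped because currency() leaves grouping off by default. The sketch below tries the grouping and international options of currency() while guarding against locales that are missing or that define no currency format. The helper name format_money() is hypothetical, and what it prints depends on which locales are installed.

```python
import locale


def format_money(amount, loc, **options):
    """Hypothetical helper: format amount under loc, or return None."""
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, loc)
        return locale.currency(amount, **options)
    except (locale.Error, ValueError):
        # locale.Error: the locale is not available on this system.
        # ValueError: the locale defines no currency format (such as 'C').
        return None
    finally:
        # Put the previous locale back so callers are not affected.
        locale.setlocale(locale.LC_ALL, saved)


for loc in ['en_US', 'fr_FR', 'C']:
    print(loc, format_money(1234.56, loc, grouping=True))
    print(loc, format_money(1234.56, loc, international=True))
```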

Formatting Numbers

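Numbers not related to currency are also formatted differently depending on the locale. In particular, the grouping character used to separate large numbers into readable chunks changes.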

locale_grouping.py

import locale

sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]

print('{:>10} {:>10} {:>15}'.format(
    'Locale', 'Integer', 'Float')
)
for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)

    print('{:>10}'.format(name), end=' ')
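    # locale.format() was removed in Python 3.12; format_string() accepts
    # the same arguments here.
    print(locale.format_string('%10d', 123456, grouping=True), end=' ')
    print(locale.format_string('%15.2f', 123456.78, grouping=True))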

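To format numbers without the currency symbol, use format_string() instead of currency(). Older Python releases also offered format() for this purpose, but it was removed in Python 3.12.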

$ python3 locale_grouping.py

    Locale    Integer           Float
       USA    123,456      123,456.78
    France     123456       123456,78
     Spain     123456       123456,78
  Portugal     123456       123456,78
    Poland    123 456      123 456,78
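The grouping lists reported by localeconv(), such as [3, 3, 0] or [127], control where the separator is inserted: each entry is a group size counted from the right, 0 means the previous size repeats, and locale.CHAR_MAX means no further grouping. The function below is a hand-written sketch of that rule for illustration only; group_digits() is a hypothetical name, and real code should rely on format_string() with grouping=True.

```python
import locale


def group_digits(digits, grouping, sep):
    """Insert sep into a digit string following a localeconv()-style list."""
    if not grouping or not sep:
        return digits
    groups = []
    size = None
    for interval in grouping:
        if interval == locale.CHAR_MAX:
            break  # CHAR_MAX: perform no further grouping
        if interval == 0:
            # 0: keep reusing the previous group size
            while size and len(digits) > size:
                groups.append(digits[-size:])
                digits = digits[:-size]
            break
        size = interval
        if len(digits) <= size:
            break
        groups.append(digits[-size:])
        digits = digits[:-size]
    groups.append(digits)
    return sep.join(reversed(groups))


print(group_digits('123456', [3, 3, 0], ','))
print(group_digits('1234567', [3, locale.CHAR_MAX], ','))
```

A list that begins with locale.CHAR_MAX, like the [127] reported for France and Spain above, therefore leaves the digits ungrouped.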

To convert locale-formatted numbers to a normalized, locale-agnostic format, use delocalize().

locale_delocalize.py

import locale

sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]

for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    # format_string() replaces locale.format(), removed in Python 3.12.
    localized = locale.format_string('%0.2f', 123456.78, grouping=True)
    delocalized = locale.delocalize(localized)
    print('{:>10}: {:>10}  {:>10}'.format(
        name,
        localized,
        delocalized,
    ))

Grouping punctuation is removed and the decimal separator is converted to a period (.).

$ python3 locale_delocalize.py

       USA: 123,456.78   123456.78
    France:  123456,78   123456.78
     Spain:  123456,78   123456.78
  Portugal:  123456,78   123456.78
    Poland: 123 456,78   123456.78
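The transformation can be pictured as two string replacements driven by the thousands_sep and decimal_point entries of localeconv(). The sketch below applies them to hand-written convention dictionaries so it runs without any particular locale installed; strip_locale_formatting() is a hypothetical name, and this is an illustration of the idea rather than a replacement for delocalize().

```python
def strip_locale_formatting(text, conv):
    """Remove grouping and normalize the decimal separator to '.'."""
    thousands_sep = conv['thousands_sep']
    if thousands_sep:
        # Drop the grouping character first, since in some locales it is
        # a period and would otherwise be confused with the decimal point.
        text = text.replace(thousands_sep, '')
    decimal_point = conv['decimal_point']
    if decimal_point:
        text = text.replace(decimal_point, '.')
    return text


# Hand-written conventions for illustration, not read from a real locale.
space_and_comma = {'thousands_sep': ' ', 'decimal_point': ','}
period_and_comma = {'thousands_sep': '.', 'decimal_point': ','}

print(strip_locale_formatting('123 456,78', space_and_comma))
print(strip_locale_formatting('123.456,78', period_and_comma))
```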

Parsing Numbers

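Besides generating output in different formats, the locale module helps with parsing input. It includes atoi() and atof() functions for converting strings to integer and floating-point values based on the numeric formatting conventions of the locale.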

locale_atof.py

import locale

sample_data = [
    ('USA', 'en_US', '1,234.56'),
    ('France', 'fr_FR', '1234,56'),
    ('Spain', 'es_ES', '1234,56'),
    ('Portugal', 'pt_PT', '1234.56'),
    ('Poland', 'pl_PL', '1 234,56'),
]

for name, loc, a in sample_data:
    locale.setlocale(locale.LC_ALL, loc)
    print('{:>10}: {:>9} => {:f}'.format(
        name,
        a,
        locale.atof(a),
    ))

The parser recognizes the grouping and decimal separator values of the locale.

$ python3 locale_atof.py

       USA:  1,234.56 => 1234.560000
    France:   1234,56 => 1234.560000
     Spain:   1234,56 => 1234.560000
  Portugal:   1234.56 => 1234.560000
    Poland:  1 234,56 => 1234.560000
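User input is not always well formed, and binary floating point is often a poor fit for amounts of money. The sketch below passes decimal.Decimal as the func argument of atof() and turns unparsable input into None. It runs under the 'C' locale so that it works on any system; the helper name parse_amount() is an assumption made for illustration.

```python
import locale
from decimal import Decimal, InvalidOperation

# Use the always-available 'C' locale for a reproducible example.
locale.setlocale(locale.LC_NUMERIC, 'C')


def parse_amount(text):
    """Parse text as a Decimal using the current locale, or return None."""
    try:
        # atof() delocalizes the text and then calls func on the result.
        return locale.atof(text, func=Decimal)
    except InvalidOperation:
        # Decimal raises InvalidOperation for text that is not a number.
        return None


print(parse_amount('1234.56'))
print(parse_amount('not a number'))
print(locale.atoi('1234'))
```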

Dates and Times

Another important aspect of localization is date and time formatting.

locale_date.py

import locale
import time

sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]

for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    # Named date_format to avoid shadowing the built-in format().
    date_format = locale.nl_langinfo(locale.D_T_FMT)
    print('{:>10}: {}'.format(name, time.strftime(date_format)))

This example uses the date formatting string for the locale to print the current date and time.

$ python3 locale_date.py

       USA: Sun Mar 18 16:20:59 2018
    France: Dim 18 mar 16:20:59 2018
     Spain: dom 18 mar 16:20:59 2018
  Portugal: Dom 18 Mar 16:20:59 2018
    Poland: ndz 18 mar 16:20:59 2018
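nl_langinfo() is not available on every platform, so portable code may want a fallback. The sketch below uses the '%c' directive of strftime(), which requests the locale's own date and time representation, when nl_langinfo() is missing. The helper name locale_timestamp() is hypothetical, and the example runs under the 'C' locale with a fixed timestamp so that it does not depend on installed locale data.

```python
import locale
import time


def locale_timestamp(moment=None):
    """Format a struct_time using the current locale's date/time layout."""
    if hasattr(locale, 'nl_langinfo'):
        layout = locale.nl_langinfo(locale.D_T_FMT)
    else:
        # Fallback for platforms without nl_langinfo(): '%c' asks
        # strftime() for the locale's preferred representation.
        layout = '%c'
    if moment is None:
        moment = time.localtime()
    return time.strftime(layout, moment)


locale.setlocale(locale.LC_TIME, 'C')
print(locale_timestamp(time.gmtime(0)))
```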

See also