Regular Expressions Python - unicode to plain text -
anyone know simple way following?
given unicode string containing 'usa' in chinese (美国), japanese (米国), or korean (미국), write function returns plain python byte string in international version of 'usa' translated 'usa' in english. example: translate_usa(u'美国 country.') should return 'usa country.'
you chain str.replace
or use regex matches of 3 strings.
>>> import re >>> usa = (u'美国', u'米国', u'미êµ') >>> re.sub('|'.join(usa), 'usa', u'美国 country.') u'usa country.'
Comments
Post a Comment