Python Programming Glossary: unicodedata.category

Python unicode regular expression matching failing with some unicode characters - bug or mistake?

http://stackoverflow.com/questions/12746458/python-unicode-regular-expression-matching-failing-with-some-unicode-characters

re_ .. assert re_.search(^\w.., word, flags=re_.UNICODE) .. print unicodedata.category(cp) for cp in word .. print ..join(ch for ch in regex.findall(\X, word))..
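
The snippet above inspects the category of each code point in a word that \w fails to match. A minimal Python 3 sketch of that diagnosis follows. The sample word and the helper name is_word are assumptions for illustration, not taken from the original question. The standard re module treats \w as "alphanumeric or underscore", so combining marks (categories Mn and Mc) typically do not match; the helper accepts them explicitly.

```python
import re
import unicodedata

# Assumed sample: the Hindi word "हिंदी", written with escapes to avoid encoding issues.
word = "\u0939\u093f\u0902\u0926\u0940"

# General category of each code point; combining marks show up as Mc / Mn.
categories = [unicodedata.category(cp) for cp in word]


def is_word(text):
    """Hypothetical helper: True if every code point is a letter (L*),
    mark (M*), number (N*), or connector punctuation (Pc) such as "_"."""
    return bool(text) and all(
        unicodedata.category(ch)[0] in "LMN" or unicodedata.category(ch) == "Pc"
        for ch in text
    )


print(categories)
print(re.fullmatch(r"\w+", word) is not None)  # what the stdlib re reports
print(is_word(word))                           # category-based check
```

The third-party regex module mentioned in the snippet offers \X for matching whole grapheme clusters; it is not used here so that the sketch stays within the standard library.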

How to implement Unicode string matching by folding in python

http://stackoverflow.com/questions/1410308/how-to-implement-unicode-string-matching-by-folding-in-python

c for c in unicodedata.normalize('NFD', unicode(s)) if unicodedata.category(c) != 'Mn' .. strip_accents(u'Östblocket') 'Ostblocket'..
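
The excerpt above decomposes the string and drops nonspacing marks. A minimal Python 3 sketch of the same idea is shown here; it assumes str input, so the unicode() call from the Python 2 excerpt is not needed.

```python
import unicodedata


def strip_accents(s):
    """Remove accents by decomposing to NFD and dropping nonspacing marks (Mn)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


print(strip_accents("\u00d6stblocket"))  # input is "Östblocket"
```

Characters with no canonical decomposition (for example "ø" or "ß") pass through unchanged, so this is accent stripping rather than full case or compatibility folding.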

Playing around with Devanagari characters

http://stackoverflow.com/questions/6805311/playing-around-with-devanagari-characters

by looking up the Unicode category for each code point: map(unicodedata.category, a) ['Lo', 'Mc', 'Lo', 'Mn', 'Lo', 'Lo', 'Zs', 'Lo', 'Mn', 'Lo', 'Mc', 'Zs'.. None virama = u'\N{DEVANAGARI SIGN VIRAMA}' for c in s: cat = unicodedata.category(c)[0] if cat == 'M' or cat == 'L' and last == virama: cluster += c else: if..
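
The loop excerpted above groups code points into clusters: a mark always joins the current cluster, and a letter joins it only when the previous code point was a virama. A Python 3 sketch of that loop follows. The function name split_clusters and the parts of the loop that the excerpt truncates (yielding the finished cluster, tracking the last code point) are assumptions filled in for illustration. This is a simplified heuristic, not the full Unicode grapheme cluster algorithm.

```python
import unicodedata


def split_clusters(s):
    """Yield simplified Devanagari clusters (hypothetical helper)."""
    virama = "\N{DEVANAGARI SIGN VIRAMA}"
    cluster = ""
    last = None
    for c in s:
        cat = unicodedata.category(c)[0]  # major class: "L", "M", "Z", ...
        if cat == "M" or (cat == "L" and last == virama):
            # Marks, and letters that follow a virama (conjuncts), extend the cluster.
            cluster += c
        else:
            if cluster:
                yield cluster
            cluster = c
        last = c
    if cluster:
        yield cluster


# Assumed sample: "हिंदी" splits into two clusters.
print(list(split_clusters("\u0939\u093f\u0902\u0926\u0940")))
```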

removing accent and special characters [duplicate]

http://stackoverflow.com/questions/8694815/removing-accent-and-special-characters

''.join(x for x in unicodedata.normalize('NFKD', data) if unicodedata.category(x)[0] == 'L').lower() Is there any better way to do this?..
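
The one-liner above keeps only code points whose major category is "L" (letter) after NFKD normalization, then lowercases the result. A runnable Python 3 sketch follows; the function name letters_only is an assumption for illustration.

```python
import unicodedata


def letters_only(data):
    """Keep only letters after NFKD decomposition, then lowercase (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", data)
    return "".join(x for x in decomposed if unicodedata.category(x)[0] == "L").lower()


print(letters_only("Caf\u00e9, 123!"))
```

Note that this filter also discards spaces, digits, and punctuation, not just accents, which may or may not be what the caller wants.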

Stripping non printable characters from a string in python

http://stackoverflow.com/questions/92438/stripping-non-printable-characters-from-a-string-in-python

module is quite helpful for this, especially the unicodedata.category function. See Unicode Character Database for descriptions of.. 0x110000 control_chars = ''.join(c for c in all_chars if unicodedata.category(c) == 'Cc') # or equivalently and much more efficiently control_chars..
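
The excerpt above collects every character in category Cc (control) so that they can be stripped from a string. A Python 3 sketch of two equivalent approaches follows. The names remove_control_chars, remove_control_chars_re, and CONTROL_CHAR_RE are assumptions for illustration; the hard-coded ranges rely on Cc covering U+0000 to U+001F and U+007F to U+009F.

```python
import re
import unicodedata

# Cc (control) covers U+0000-U+001F and U+007F-U+009F.
CONTROL_CHAR_RE = re.compile(r"[\x00-\x1f\x7f-\x9f]")


def remove_control_chars(s):
    """Drop control characters by checking each code point's category."""
    return "".join(c for c in s if unicodedata.category(c) != "Cc")


def remove_control_chars_re(s):
    """Same result using a precompiled character class."""
    return CONTROL_CHAR_RE.sub("", s)


sample = "a\x00b\tc\x7fd"
print(repr(remove_control_chars(sample)))
print(repr(remove_control_chars_re(sample)))
```

Tab, newline, and carriage return are also in Cc, so both functions remove them; exclude those from the filter if whitespace should survive.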