How to remove punctuation from text in Python

Posted: 2020-10-12 18:26:12


Python offers the following ways to remove punctuation:

Method 1:

str.isalnum:

S.isalnum() -> bool

Return value: True if the string has at least one character and every character is a letter or a digit; otherwise False.

Example:

>>> string = "Special $#! characters spaces 888323"
>>> ''.join(e for e in string if e.isalnum())
'Specialcharactersspaces888323'

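This filter keeps only letters and digits, so it is heavy-handed: spaces and all other whitespace are removed along with the punctuation. On Python 2 byte strings it also strips Chinese characters; on Python 3 str values Chinese characters count as letters and are kept.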
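A minimal sketch of a gentler form of Method 1, assuming Python 3: it keeps whitespace as well as alphanumeric characters. The function name `strip_non_alnum` and its `keep_whitespace` flag are hypothetical, introduced only for this illustration.

```python
def strip_non_alnum(text, keep_whitespace=True):
    """Drop every character that is neither alphanumeric nor (optionally) whitespace."""
    return "".join(
        ch for ch in text
        if ch.isalnum() or (keep_whitespace and ch.isspace())
    )


sample = "Special $#! characters spaces 888323"
print(strip_non_alnum(sample, keep_whitespace=False))  # Specialcharactersspaces888323
print(strip_non_alnum("a b!"))                         # a b
# On Python 3, CJK characters are alphanumeric; full-width punctuation is not.
print(strip_non_alnum("你好,世界!"))                    # 你好世界
```

With `keep_whitespace=False` the function behaves like the one-liner in the example above.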

Method 2:

string.punctuation:

import re
import string

s = "string. With. Punctuation?"  # sample string

# Variant 1 (Python 2 only):
# out = s.translate(string.maketrans("", ""), string.punctuation)

# Variant 2 (Python 2 only):
# out = s.translate(None, string.punctuation)

# Variant 3:
exclude = set(string.punctuation)
out = ''.join(ch for ch in s if ch not in exclude)

# Variant 4:
out = s
for c in string.punctuation:
    out = out.replace(c, "")
# out == 'string With Punctuation'

# Variant 5:
# re.escape escapes every character that could be read as a regex operator.
out = re.sub('[%s]' % re.escape(string.punctuation), '', s)

# Variant 6:
# string.punctuation covers ASCII only. A broader (but slower) approach
# uses the unicodedata module:
from unicodedata import category

s = 'String — with - «Punctuation »...'
out = re.sub('[%s]' % re.escape(string.punctuation), '', s)
print('Stripped', out)  # out == 'String — with  «Punctuation »'
out = ''.join(ch for ch in s if category(ch)[0] != 'P')
print('Stripped', out)  # out == 'String  with  Punctuation '

# For Python 3 str (or Python 2 unicode) values, translate() takes only a
# mapping: code points (integers) are looked up in it, and anything mapped
# to None is removed. To remove the ASCII punctuation this way:
remove_punct_map = dict.fromkeys(map(ord, string.punctuation))
out = s.translate(remove_punct_map)

# Variants 1 and 2 do not work in Python 3, because translate() no longer
# accepts a second argument. A table covering all Unicode punctuation:
import sys
import unicodedata

tbl = dict.fromkeys(i for i in range(sys.maxunicode + 1)
                    if unicodedata.category(chr(i)).startswith('P'))

def remove_punctuation(text):
    return text.translate(tbl)

Method 3:

re:

Example:

import re

s = "string. With. Punctuation?"
# Remove everything that is neither a word character nor whitespace.
s = re.sub(r'[^\w\s]', '', s)
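One detail of the regular expression in Method 3: `\w` matches the underscore, so `[^\w\s]` leaves underscores in the text. The sketch below, assuming Python 3, adds an alternative for `_`; the names `PUNCT_RE` and `remove_punctuation_regex` are hypothetical.

```python
import re

# Anything that is not a word character or whitespace, plus the underscore,
# which \w would otherwise protect.
PUNCT_RE = re.compile(r"[^\w\s]|_")


def remove_punctuation_regex(text):
    """Strip punctuation, symbols and underscores; keep letters, digits, whitespace."""
    return PUNCT_RE.sub("", text)


print(re.sub(r"[^\w\s]", "", "snake_case, ok!"))  # snake_case ok
print(remove_punctuation_regex("snake_case, ok!"))  # snakecase ok
# For str patterns in Python 3, \w is Unicode-aware, so CJK text survives.
print(remove_punctuation_regex("你好,世界!"))       # 你好世界
```

Compiling the pattern once is a small convenience when the function is called many times.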

Benchmark:

import re
import string
import timeit

s = "string. With. Punctuation"
exclude = set(string.punctuation)
# Python 3 form; the original listing used string.maketrans("", ""),
# which exists only in Python 2.
table = str.maketrans("", "", string.punctuation)
regex = re.compile('[%s]' % re.escape(string.punctuation))

def test_set(s):
    return ''.join(ch for ch in s if ch not in exclude)

def test_re(s):
    return regex.sub('', s)

def test_trans(s):
    return s.translate(table)

def test_repl(s):
    for c in string.punctuation:
        s = s.replace(c, "")
    return s

print("sets      :", timeit.Timer('f(s)', 'from __main__ import s, test_set as f').timeit(1000000))
print("regex     :", timeit.Timer('f(s)', 'from __main__ import s, test_re as f').timeit(1000000))
print("translate :", timeit.Timer('f(s)', 'from __main__ import s, test_trans as f').timeit(1000000))
print("replace   :", timeit.Timer('f(s)', 'from __main__ import s, test_repl as f').timeit(1000000))

Output, in seconds per 1,000,000 calls. These are the figures given with the original listing, which was written for Python 2; absolute numbers will differ by machine and Python version.

# sets      : 19.8566138744
# regex     : 6.86155414581
# translate : 2.12455511093
# replace   : 28.4436721802
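A compact way to rerun this kind of comparison on a current interpreter is sketched below, assuming Python 3. It first checks that the variants agree on the sample and then reports the best of several short runs. The helper name `benchmark` and the `number` and `repeat` defaults are illustrative choices, not taken from the original test.

```python
import re
import string
import timeit

s = "string. With. Punctuation"
exclude = set(string.punctuation)
table = str.maketrans("", "", string.punctuation)
regex = re.compile("[%s]" % re.escape(string.punctuation))

candidates = {
    "sets": lambda: "".join(ch for ch in s if ch not in exclude),
    "regex": lambda: regex.sub("", s),
    "translate": lambda: s.translate(table),
}

# Timing is only meaningful if every variant produces the same result.
expected = "string With Punctuation"
assert all(func() == expected for func in candidates.values())


def benchmark(funcs, number=10_000, repeat=3):
    """Return the best (minimum) total time in seconds for each callable."""
    return {
        name: min(timeit.repeat(func, number=number, repeat=repeat))
        for name, func in funcs.items()
    }


results = benchmark(candidates)
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:<10} {seconds:.4f} s")
```

Taking the minimum over repeats reduces the influence of background load on the measurement.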
