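Hello friends, I have a small problem. I need to extract only the words from any text.
I have tried strtok(), strstr(), and a few regular expressions, but I only managed to extract some of the words.
The problem is complicated by the number of characters and symbols that can accompany a word.
Here is an example text from which the words must be extracted: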
Main article: our 46,000 required, !but (1947-) mail@ March 8, Gutenberg's 34-DE 'a' 3,1415 Us: @unknown n go or and (r) The 509th "composite" and; C-54 #dog v4.0 ¿as is done? ¿article... agriculture? x ¿cat? now! Hi!! (87 meters).
Sample text, for testing.
The result of extracting the text should be:
Main article our required but March Gutenberg's a go or and The composite and dog as is done article agriculture cat now Hi meters
Sample text for testing
The first function I wrote, to make the work easier:
function PreText($text){
    // Turn line breaks into periods.
    $text = str_replace("\n", ".", $text);
    $text = str_replace("\r", ".", $text);
    // Strip punctuation that can be attached to a word.
    $text = str_replace("'", "", $text);
    $text = str_replace("?", "", $text);
    $text = str_replace("¿", "", $text);
    $text = str_replace("(", "", $text);
    $text = str_replace(")", "", $text);
    $text = str_replace('"', "", $text);
    $text = str_replace(';', "", $text);
    $text = str_replace('!', "", $text);
    $text = str_replace('<', "", $text);
    $text = str_replace('>', "", $text);
    $text = str_replace('#', "", $text);
    $text = str_replace(",", "", $text);
    $text = str_replace(".c", "", $text);
    $text = str_replace(".C", "", $text);
    return $text;
}
The splitting function:
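function SplitWords($text){
    // Initialize the result so the .= below does not hit an undefined variable.
    $NewText = "";
    $words = explode(" ", $text);
    $ContWords = count($words);
    for ($i = 0; $i < $ContWords; $i++){
        // Keep only tokens made up entirely of letters.
        if (ctype_alpha($words[$i])) {
            $NewText .= $words[$i].", ";
        }
    }
    return $NewText;
}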
The program:
<?php
include_once ('functions.php');
$text = "Main article: our 46,000 ...";
$text = PreText($text);
$text = SplitWords($text);
echo $text;
?>
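The code still has a long way to go. Thank you for your help.
Solution:
If I understand correctly, you want to remove all non-letters from the string. I would use preg_replace: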
$text = "Main article: our 46,000...";
$text = preg_replace("/[^a-zA-Z' ]/","",$text);
This should remove everything that is not a letter, an apostrophe, or a space.
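Note that the character-level replacement keeps the letters of mixed tokens, so "509th" leaves "th" behind and "C-54" leaves "C". The following is only a rough sketch of two ways to handle this, not a definitive solution. The function names stripNonLetters and extractWords are hypothetical and do not come from the question. The first wraps the preg_replace call above and collapses leftover spaces; the second works token by token and drops any token that still contains something other than letters and apostrophes after its surrounding punctuation is trimmed.

```php
<?php
// Hypothetical helper: the preg_replace from the answer, plus space clean-up.
function stripNonLetters(string $text): string
{
    // Remove every byte that is not an ASCII letter, an apostrophe, or a space.
    $text = preg_replace("/[^a-zA-Z' ]/", "", $text);
    // Collapse the runs of spaces left where whole tokens were removed.
    return trim(preg_replace("/ {2,}/", " ", $text));
}

// Hypothetical helper: token-wise variant that discards mixed tokens.
function extractWords(string $text): array
{
    $words = [];
    foreach (preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) as $token) {
        // Trim leading and trailing punctuation, but leave digits in place
        // so that tokens such as "509th" are rejected below rather than cut down.
        $core = preg_replace('/^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+$/', "", $token);
        // Keep the token only if it is letters with optional inner apostrophes.
        if (preg_match("/^[a-zA-Z]+(?:'[a-zA-Z]+)*$/", $core)) {
            $words[] = $core;
        }
    }
    return $words;
}

$sample = 'The 509th "composite" and; C-54 #dog';
echo stripNonLetters($sample), "\n";            // The th composite and C dog
echo implode(" ", extractWords($sample)), "\n"; // The composite and dog
```

The token-wise variant still does not reproduce the expected result from the question exactly: it keeps "mail" from "mail@" and single letters such as "n" and "x", because the rule that excludes them is not stated in the question.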
Tags: php, regex
Source: https://codeday.me/bug/0609/1206506.html