Introduction to Regular Expression Functions (the re Module)

Published: 2023-06-04 19:32:35

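Regular expressions are a tool for matching, searching, and manipulating text based on patterns. In Python, the re module is part of the standard library and provides a set of functions for working with them. This article covers the usage of several commonly used re functions.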

re.search(pattern, string, flags=0)

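re.search scans a string for the first location where the regular expression matches and returns a match object, or None if there is no match. Its parameters are:

pattern: a regular expression pattern, either a string or a precompiled regular expression object.

string: the string to search.

flags: flags that control how the pattern is matched, such as ignoring case or multi-line matching. Common flags include re.IGNORECASE and re.MULTILINE.

Example: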

import re

pattern = 'world'
string = 'hello, world!'

match = re.search(pattern, string)

if match:
    print("Match found: ", match.group())
else:
    print("Match not found")

# Output: Match found:  world
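
The flags parameter described above is easier to follow with a concrete case. The following is a minimal sketch; the sample text and variable names are made up for illustration and are not from the original example.

```python
import re

text = "Hello, World!\nhello again"

# Without flags the search is case-sensitive, so only the
# lowercase "hello" on the second line can match.
plain = re.search("hello", text)

# re.IGNORECASE lets the same pattern match "Hello" at the start.
ignore_case = re.search("hello", text, flags=re.IGNORECASE)

# re.MULTILINE makes ^ match at the start of every line;
# flags are combined with the | operator.
line_starts = re.findall(r"^hello", text, flags=re.IGNORECASE | re.MULTILINE)

print(plain.start())         # 14
print(ignore_case.group())   # Hello
print(line_starts)           # ['Hello', 'hello']
```

Note that flags apply when the pattern is given as a string; a precompiled pattern carries the flags it was compiled with.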

re.match(pattern, string, flags=0)

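re.match tries to match the regular expression at the beginning of the string only and returns a match object, or None if the match fails. It takes the same parameters as re.search.

Example: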

import re

pattern = 'hello'
string = 'hello, world!'

match = re.match(pattern, string)

if match:
    print("Match found: ", match.group())
else:
    print("Match not found")

# Output: Match found:  hello
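
The difference between re.match and re.search is easy to miss when the pattern happens to sit at the start of the string. The sketch below reuses the string from the example and searches for 'world' instead; re.fullmatch is included only for comparison and is not covered elsewhere in this article.

```python
import re

text = "hello, world!"

# re.match only tries position 0, so "world" is not found.
at_start = re.match("world", text)

# re.search scans the whole string and finds it at index 7.
anywhere = re.search("world", text)

# re.fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(r"hello, \w+!", text)

print(at_start)          # None
print(anywhere.span())   # (7, 12)
print(whole.group())     # hello, world!
```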

re.findall(pattern, string, flags=0)

re.findall finds all non-overlapping substrings that match the regular expression and returns them as a list. It takes the same parameters as re.search.

Example:

import re

pattern = r'\d+'  # raw string, so \d is not treated as a string escape
string = 'hello, 12345 world!'

matches = re.findall(pattern, string)

print(matches)

# Output: ['12345']
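
What re.findall returns depends on whether the pattern contains capture groups, which can be surprising. The following sketch uses a made-up key=value string to show the three cases.

```python
import re

text = "width=640, height=480"

# No groups: each list item is the whole match.
whole = re.findall(r"\w+=\d+", text)

# One group: each list item is just that group's text.
values = re.findall(r"\w+=(\d+)", text)

# Several groups: each list item is a tuple of the groups.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(whole)    # ['width=640', 'height=480']
print(values)   # ['640', '480']
print(pairs)    # [('width', '640'), ('height', '480')]
```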

re.sub(pattern, repl, string, count=0, flags=0)

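re.sub replaces the substrings that match the regular expression with a given replacement and returns a new string. Its parameters are:

pattern: a regular expression pattern, either a string or a precompiled regular expression object.

repl: the replacement for each matched substring. This is usually a string; re.sub also accepts a function that takes the match object and returns the replacement.

string: the string to search.

count: optional; the maximum number of replacements. If omitted (or 0), all matches are replaced.

flags: flags that control how the pattern is matched, such as ignoring case or multi-line matching. Common flags include re.IGNORECASE and re.MULTILINE.

Example: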

import re

pattern = r'\d+'  # raw string, so \d is not treated as a string escape
string = 'hello, 12345 world!'

new_string = re.sub(pattern, '', string)

print(new_string)

# Output: hello,  world!
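
The example above only deletes matches. As a rough sketch of the other parameters, the code below limits the number of replacements with count, refers back to a capture group with \1, and passes a function as repl. The sample string is invented for illustration.

```python
import re

text = "a1 b22 c333"

# count=1 replaces only the first match.
first_only = re.sub(r"\d+", "#", text, count=1)

# \1 in the replacement inserts the text of capture group 1.
wrapped = re.sub(r"(\d+)", r"<\1>", text)

# A function as repl receives the match object and returns a string.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)

print(first_only)   # a# b22 c333
print(wrapped)      # a<1> b<22> c<333>
print(doubled)      # a2 b44 c666
```

Passing count as a keyword argument keeps the call readable and avoids confusing it with flags.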

re.split(pattern, string, maxsplit=0, flags=0)

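re.split splits the string at each position where the regular expression matches and returns a list of substrings. Its parameters are similar to those of re.sub, without repl. maxsplit sets the maximum number of splits; if omitted (or 0), the string is split at every match.

Example: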

import re

pattern = r'\W+'  # raw string, so \W is not treated as a string escape
string = 'hello, 12345 world!'

parts = re.split(pattern, string)

print(parts)

# Output: ['hello', '12345', 'world', '']
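
The trailing empty string in the output appears because the string ends with a separator ('!'). The sketch below, using the same string, shows maxsplit, a capture group that keeps the separators in the result, and one simple way to drop empty pieces.

```python
import re

text = "hello, 12345 world!"

# maxsplit=1 stops after the first split.
limited = re.split(r"\W+", text, maxsplit=1)

# A capture group in the pattern keeps the separators in the result.
with_seps = re.split(r"(\W+)", text)

# Filtering removes the empty string left by the trailing "!".
non_empty = [part for part in re.split(r"\W+", text) if part]

print(limited)     # ['hello', '12345 world!']
print(with_seps)   # ['hello', ', ', '12345', ' ', 'world', '!', '']
print(non_empty)   # ['hello', '12345', 'world']
```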

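These are some of the commonly used functions in the re module. They cover the basic regular expression operations: matching, replacing, and splitting strings. In real-world development, more advanced techniques such as capture groups and zero-width assertions (lookahead and lookbehind) are also available. Fluency with the re module makes text processing considerably more efficient.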
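
As a brief illustration of the capture groups and zero-width assertions mentioned above, here is a minimal sketch. The sample text and the group names key and value are hypothetical and chosen only for this example.

```python
import re

text = "price: 42 USD, weight: 7 kg"

# Named capture groups: (?P<name>...) makes parts of the match
# available by name on the match object.
m = re.search(r"(?P<key>\w+): (?P<value>\d+)", text)

# Lookahead (?=...): match digits only when " USD" follows,
# without including " USD" in the result.
usd = re.findall(r"\d+(?= USD)", text)

# Lookbehind (?<=...): match digits only when "weight: " precedes them.
weight = re.findall(r"(?<=weight: )\d+", text)

print(m.group("key"), m.group("value"))   # price 42
print(usd)                                # ['42']
print(weight)                             # ['7']
```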