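Common Python String Operations

Creating Strings and Their Basic Properties

In Python, a string (str) is a built-in data type that represents text. To create one, enclose the text in single quotes ('), double quotes ("), or triple quotes (''' or """). Triple-quoted strings may span multiple lines.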

s1 = 'Hello, World!'
s2 = "Hello, World!"
s3 = '''Hello, 
World!'''
s4 = """Hello, 
World!"""

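Strings are immutable: once a string object has been created, its contents cannot be modified in place. For example:

s = 'Hello'
try:
    s[0] = 'h'
except TypeError as e:
    print(f"Error: {e}")

This code tries to change the first character of s and raises a TypeError, because strings do not support item assignment.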
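
Because a string cannot be changed in place, the usual workaround is to build a new string and bind it to a name. The following is a minimal sketch of two common ways to do that; the variable names are illustrative only.

```python
s = 'Hello'

# Build a new string from a replacement character plus the rest of the old one.
lowered = 'h' + s[1:]

# Or convert to a list (which is mutable), edit it, and join it back together.
chars = list(s)
chars[0] = 'J'
jello = ''.join(chars)

print(lowered)  # hello
print(jello)    # Jello
print(s)        # Hello (the original string is unchanged)
```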

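String Indexing and Slicing

Indexing
- Every character in a string has an index. Indexes start at 0 for the first character and increase by one per character. Python also supports negative indexes: -1 is the last character, -2 is the second to last, and so on.
```
s = 'Python'
print(s[0])  # Output: P
print(s[2])  # Output: t
print(s[-1]) # Output: n
print(s[-3]) # Output: h
```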

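Slicing

Slicing extracts a substring from a string. The syntax is s[start:stop:step], where start is the starting index (inclusive), stop is the ending index (exclusive), and step is the stride, which defaults to 1.
Basic slicing

s = 'Python'
print(s[1:4])  # Output: yth (from index 1 up to, but not including, index 4)

Omitting the start index

s = 'Python'
print(s[:3])  # Output: Pyt (start defaults to 0 when omitted)

Omitting the stop index

s = 'Python'
print(s[2:])  # Output: thon (stop defaults to the end of the string when omitted)

Omitting both start and stop

s = 'Python'
print(s[:])  # Output: Python (the whole string)

Specifying a step

s = 'Python'
print(s[0:5:2])  # Output: Pto (step of 2)
print(s[::-1])  # Output: nohtyP (a step of -1 reverses the string)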
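
As a supplement to the start, stop, and step rules, the sketch below shows two behaviors that the examples above do not cover: negative indexes also work inside slices, and out-of-range slice bounds are clipped instead of raising an error, unlike plain indexing. The variable names are illustrative.

```python
s = 'Python'

last_three = s[-3:]    # a negative start counts from the end
clipped = s[2:100]     # a stop past the end is clipped to len(s)
empty = s[10:]         # a start past the end yields an empty string
backwards = s[4:1:-1]  # indexes 4, 3, 2, in that order

print(last_three)   # hon
print(clipped)      # thon
print(repr(empty))  # ''
print(backwards)    # oht

# Plain indexing is stricter: an out-of-range index raises IndexError.
try:
    s[10]
    index_failed = False
except IndexError as e:
    index_failed = True
    print(f"IndexError: {e}")
```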

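String Concatenation

Using the + operator
- The + operator joins two or more strings.
```
s1 = 'Hello'
s2 = 'World'
s3 = s1 + ', ' + s2
print(s3)  # Output: Hello, World
```
However, + is inefficient for combining many strings (for example, inside a loop), because each concatenation creates a new string object.
Using the join() method
- join() is a more efficient way to concatenate, especially when there are many strings. Its syntax is separator.join(iterable), where separator is the string placed between elements and iterable is an iterable of strings (such as a list or tuple).
```
parts = ['Hello', 'World']
s = ', '.join(parts)
print(s)  # Output: Hello, World
```
Here ', ' is the separator that joins the strings in the list parts.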
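
One detail worth illustrating is that join() accepts only strings, so other values have to be converted first. The sketch below, with made-up variable names, collects pieces in a list and joins them once at the end, which is the usual replacement for repeated + in a loop.

```python
numbers = [1, 2, 3, 4]

# join() raises TypeError if the iterable contains non-string items.
try:
    ', '.join(numbers)
    join_failed = False
except TypeError as e:
    join_failed = True
    print(f"TypeError: {e}")

# Convert each item to str first, then join once.
line = ', '.join(str(n) for n in numbers)
print(line)  # 1, 2, 3, 4

# Collect pieces in a list inside the loop, then join at the end.
pieces = []
for n in numbers:
    pieces.append(f'item{n}')
joined = '-'.join(pieces)
print(joined)  # item1-item2-item3-item4
```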

Repeating Strings

Using the * operator
- The * operator repeats a string a given number of times.
```
s = 'Hi'
new_s = s * 3
print(new_s)  # Output: HiHiHi
```

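Finding and Searching in Strings

find() method
- find() searches for a substring and returns the index of its first occurrence, or -1 if it is not found. Its syntax is s.find(sub[, start[, end]]), where sub is the substring to look for and the optional start and end indexes limit the search range.
```
s = 'Hello, World!'
print(s.find('World'))  # Output: 7
print(s.find('Python')) # Output: -1
print(s.find('l', 3))   # Search for 'l' starting at index 3; Output: 3
```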
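
The start parameter makes it possible to resume a search after a previous match. As a rough sketch, the helper below (find_all is a hypothetical name, not part of the str API) collects the index of every non-overlapping occurrence.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub.

    Hypothetical helper for illustration; it is not part of the str API.
    """
    if not sub:
        raise ValueError("sub must be a non-empty string")
    positions = []
    start = 0
    while True:
        idx = text.find(sub, start)
        if idx == -1:           # no further match
            break
        positions.append(idx)
        start = idx + len(sub)  # resume just after this match
    return positions


print(find_all('Hello, World! Hello', 'Hello'))  # [0, 14]
print(find_all('Hello, World!', 'l'))            # [2, 3, 10]
print(find_all('Hello, World!', 'xyz'))          # []
```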
index() method
- index() works like find() and also returns the index of the first occurrence of the substring. The difference is that index() raises a ValueError when the substring is not found. Its syntax is s.index(sub[, start[, end]]).
```
s = 'Hello, World!'
print(s.index('World'))  # Output: 7
try:
    print(s.index('Python'))
except ValueError as e:
    print(f"Error: {e}")
```
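rfind() and rindex() methods
- rfind() and rindex() mirror find() and index(), but they search from the end of the string and return the index of the last occurrence. rfind() returns -1 when the substring is not found; rindex() raises a ValueError.
```
s = 'Hello, World! Hello'
print(s.rfind('Hello'))  # Output: 14
print(s.rindex('Hello')) # Output: 14
```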
count() method
- count() returns the number of non-overlapping occurrences of a substring. Its syntax is s.count(sub[, start[, end]]).
```
s = 'Hello, World! Hello'
print(s.count('Hello'))  # Output: 2
print(s.count('l'))      # Output: 5
```
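
Two details of count() are easy to miss: matches are counted without overlap, and the optional start and end arguments restrict counting to a slice of the string. A small sketch with illustrative variable names:

```python
s = 'aaaa'

# Matches may not overlap: 'aa' fits at indexes 0 and 2, so the count is 2, not 3.
pairs = s.count('aa')
print(pairs)  # 2

text = 'Hello, World! Hello'

# start and end restrict counting to text[start:end].
l_in_first_word = text.count('l', 0, 5)
hello_after_first_char = text.count('Hello', 1)
print(l_in_first_word)         # 2
print(hello_after_first_char)  # 1
```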

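Replacing Substrings

replace() method
- replace() substitutes one substring for another and returns the result as a new string. Its syntax is s.replace(old, new[, count]), where old is the substring to be replaced, new is its replacement, and the optional count limits how many occurrences are replaced (all of them by default).
```
s = 'Hello, World!'
new_s = s.replace('World', 'Python')
print(new_s)  # Output: Hello, Python!
s2 = 'aaaa'
new_s2 = s2.replace('a', 'b', 2)
print(new_s2) # Output: bbaa
```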
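
Since strings are immutable, replace() never changes the original; it returns a new string that can be assigned or chained. The sketch below uses a made-up timestamp string to show this, along with deleting characters by replacing them with an empty string.

```python
raw = '2024/01/15 10:30'

# replace() returns a new string; raw itself is left untouched.
dashed = raw.replace('/', '-')
print(dashed)  # 2024-01-15 10:30
print(raw)     # 2024/01/15 10:30

# Calls can be chained, and replacing with '' deletes the substring.
compact = raw.replace('/', '').replace(':', '').replace(' ', '')
print(compact)  # 202401151030

# If old does not occur, the string comes back unchanged (no exception).
unchanged = raw.replace('x', 'y')
print(unchanged)  # 2024/01/15 10:30
```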

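Splitting and Joining Strings

split() method
- split() breaks a string into a list at each occurrence of a separator. Its syntax is s.split([sep[, maxsplit]]), where sep is the separator (by default, runs of whitespace such as spaces, tabs, and newlines) and the optional maxsplit limits the number of splits.
```
s = 'Hello, World!'
parts = s.split(', ')
print(parts)  # Output: ['Hello', 'World!']
s2 = 'a b c d'
parts2 = s2.split(' ', 2)
print(parts2) # Output: ['a', 'b', 'c d']
```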
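
Calling split() with no argument is not the same as calling split(' '). The sketch below contrasts the two on a string with repeated and surrounding spaces, and shows that an explicit separator keeps empty fields.

```python
s = ' a  b '

# No argument: split on runs of whitespace and ignore leading/trailing whitespace.
by_whitespace = s.split()
print(by_whitespace)  # ['a', 'b']

# Explicit ' ': every single space is a separator, so empty strings appear.
by_space = s.split(' ')
print(by_space)  # ['', 'a', '', 'b', '']

# The same holds for other explicit separators: empty fields are kept.
fields = 'a,,b'.split(',')
print(fields)  # ['a', '', 'b']
```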
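rsplit() method
- rsplit() works like split() but starts splitting from the end of the string, which makes a difference only when maxsplit is given. Its syntax is s.rsplit([sep[, maxsplit]]).
```
s = 'a,b,c,d'
parts = s.rsplit(',', 2)
print(parts)  # Output: ['a,b', 'c', 'd']
```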

splitlines() method

splitlines() splits a string into a list at line boundaries (\n, \r, \r\n, and so on). Its syntax is s.splitlines([keepends]); when the optional keepends is True, the line breaks are kept in the results. The default is False.

s = 'line1\nline2\rline3\r\nline4'
lines = s.splitlines()
print(lines)  # Output: ['line1', 'line2', 'line3', 'line4']
lines_with_ends = s.splitlines(True)
print(lines_with_ends) # Output: ['line1\n', 'line2\r', 'line3\r\n', 'line4']
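
To show why splitlines() is usually preferable to split('\n') for line-oriented text, the sketch below compares the two on a string with a trailing newline and on one with mixed line endings. The sample strings are made up for illustration.

```python
text = 'first\nsecond\n'

# splitlines() does not produce an empty item for the trailing newline.
with_splitlines = text.splitlines()
print(with_splitlines)  # ['first', 'second']

# split('\n') does, because the final '\n' separates 'second' from ''.
with_split = text.split('\n')
print(with_split)  # ['first', 'second', '']

# splitlines() also handles mixed line endings in one pass.
mixed = 'a\r\nb\nc'
mixed_lines = mixed.splitlines()
mixed_split = mixed.split('\n')
print(mixed_lines)  # ['a', 'b', 'c']
print(mixed_split)  # ['a\r', 'b', 'c']
```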

Joining strings (join() revisited)
- As noted earlier, join() concatenates the strings in an iterable. For example:
```
parts = ['Hello', 'World']
s = ' '.join(parts)
print(s)  # Output: Hello World
```

Changing Case

upper() method
- upper() converts all characters in a string to uppercase.
```
s = 'hello'
new_s = s.upper()
print(new_s)  # Output: HELLO
```
lower() method
- lower() converts all characters in a string to lowercase.
```
s = 'HELLO'
new_s = s.lower()
print(new_s)  # Output: hello
```
title() method
- title() capitalizes the first letter of each word and lowercases the remaining letters.
```
s = 'hello world'
new_s = s.title()
print(new_s)  # Output: Hello World
```
capitalize() method
- capitalize() uppercases the first character of the string and lowercases the rest.
```
s = 'hello world'
new_s = s.capitalize()
print(new_s)  # Output: Hello world
```
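
A common use of these methods is case-insensitive comparison, and two of them have behavior that can surprise. The sketch below is illustrative only; the input values are made up.

```python
# Case-insensitive comparison: normalize both sides before comparing.
user_input = 'YES'
is_yes = user_input.lower() == 'yes'
print(is_yes)  # True

# capitalize() lowercases everything after the first character.
capitalized = 'hello WORLD'.capitalize()
print(capitalized)  # Hello world

# title() treats any run of letters as a word, so an apostrophe starts a new one.
titled = "they're here".title()
print(titled)  # They'Re Here
```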

Stripping Whitespace

strip() method
- strip() removes whitespace (spaces, tabs, newlines, and so on) from both ends of a string.
```
s = '   Hello, World!   \n'
new_s = s.strip()
print(new_s)  # Output: Hello, World!
```

lstrip() method

lstrip() removes whitespace from the beginning of a string.

s = '   Hello, World!'
new_s = s.lstrip()
print(new_s)  # Output: Hello, World!

rstrip() method

rstrip() removes whitespace from the end of a string.

s = 'Hello, World!   \n'
new_s = s.rstrip()
print(new_s)  # Output: Hello, World!
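
The three methods also accept an optional argument, not covered in the text above, that lists the characters to strip instead of whitespace. The sketch below shows that the argument is treated as a set of characters, not as a prefix or suffix to match.

```python
# Passing a string of characters strips any of those characters, not whitespace.
title = '--title--'.strip('-')
number = '00123'.lstrip('0')
path = 'path///'.rstrip('/')
print(title)   # title
print(number)  # 123
print(path)    # path

# The argument is a set of characters: '.', 't', and 'x' are all removed
# from the right until a character outside the set ('s') is reached.
surprise = 'test.txt'.rstrip('.txt')
print(surprise)  # tes
```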

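String Formatting

Old-style formatting (% formatting)

The % operator formats strings using conversion specifiers such as %s (string), %d (integer), and %f (floating-point number).

name = 'Alice'
age = 30
s = 'My name is %s and I am %d years old.' % (name, age)
print(s)  # Output: My name is Alice and I am 30 years old.

For floating-point numbers you can also specify a precision; for example, %.2f keeps two decimal places.

num = 3.14159
s = 'The value of pi is approximately %.2f' % num
print(s)  # Output: The value of pi is approximately 3.14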

format() method

The format() method offers more flexible and powerful string formatting.
Basic usage

name = 'Bob'
age = 25
s = 'My name is {} and I am {} years old.'.format(name, age)
print(s)  # Output: My name is Bob and I am 25 years old.

Positional indexes

s = '{0} is {1} years old. {0} likes programming.'.format('Charlie', 22)
print(s)  # Output: Charlie is 22 years old. Charlie likes programming.

Keyword arguments

s = '{name} is {age} years old.'.format(name='David', age=28)
print(s)  # Output: David is 28 years old.

Formatting numbers

num = 1234.5678
s = 'The number is {:.2f}'.format(num)
print(s)  # Output: The number is 1234.57
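
Beyond precision, the format spec after the colon also controls width, alignment, padding, and grouping. The sketch below shows a few commonly used specs; the values are arbitrary.

```python
# Width and alignment: < left, > right, ^ center.
left = '[{:<8}]'.format('left')
right = '[{:>8}]'.format('right')
center = '[{:^7}]'.format('mid')
print(left)    # [left    ]
print(right)   # [   right]
print(center)  # [  mid  ]

# Thousands separator, zero padding, and percentages.
grouped = '{:,}'.format(1234567)
padded = '{:05d}'.format(42)
percent = '{:.1%}'.format(0.256)
print(grouped)  # 1,234,567
print(padded)   # 00042
print(percent)  # 25.6%
```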

f-string formatting (Python 3.6+)

f-strings are a concise and efficient way to format strings. Prefix the string literal with f or F, then place expressions inside curly braces {} within the string.

name = 'Eve'
age = 32
s = f'My name is {name} and I am {age} years old.'
print(s)  # Output: My name is Eve and I am 32 years old.

Formatting expressions

num1 = 5
num2 = 3
s = f'The sum of {num1} and {num2} is {num1 + num2}.'
print(s)  # Output: The sum of 5 and 3 is 8.
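
f-strings accept the same format specs as format(), written after a colon inside the braces. A brief sketch with made-up values:

```python
price = 1234.5678
name = 'Eve'

# A format spec after the colon works the same way as in format().
two_places = f'Total: {price:.2f}'
grouped = f'Total: {price:,.1f}'
aligned = f'[{name:>6}]'
print(two_places)  # Total: 1234.57
print(grouped)     # Total: 1,234.6
print(aligned)     # [   Eve]

# Method calls and the !r conversion (repr) are allowed inside the braces.
mixed = f'{name.upper()} / {name!r}'
print(mixed)  # EVE / 'Eve'
```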

String Test Methods

startswith() method
- startswith() checks whether a string begins with the given substring. Its syntax is s.startswith(prefix[, start[, end]]).
```
s = 'Hello, World!'
print(s.startswith('Hello'))  # Output: True
print(s.startswith('World', 7))  # Check starting at index 7; Output: True
```
endswith() method
- endswith() checks whether a string ends with the given substring. Its syntax is s.endswith(suffix[, start[, end]]).
```
s = 'Hello, World!'
print(s.endswith('World!'))  # Output: True
print(s.endswith('Hello', 0, 5)) # Check the slice from index 0 up to 5; Output: True
```
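
Both methods also accept a tuple of candidates and return True if any one of them matches. The sketch below filters a made-up list of file names this way.

```python
filenames = ['report.csv', 'photo.PNG', 'notes.txt', 'data.tsv']

# A tuple of suffixes matches if the string ends with any one of them.
tables = [f for f in filenames if f.endswith(('.csv', '.tsv'))]
print(tables)  # ['report.csv', 'data.tsv']

# Combine with lower() for a case-insensitive check.
images = [f for f in filenames if f.lower().endswith(('.png', '.jpg'))]
print(images)  # ['photo.PNG']

is_web_url = 'https://example.com'.startswith(('http://', 'https://'))
print(is_web_url)  # True
```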

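isalpha() method

isalpha() checks whether all characters in a string are letters. It returns False for an empty string.

s1 = 'Hello'
s2 = 'Hello123'
print(s1.isalpha())  # Output: True
print(s2.isalpha())  # Output: False

isdigit() method

isdigit() checks whether all characters in a string are digits (such as 0-9). It returns False for an empty string.

s1 = '123'
s2 = 'abc'
s3 = '12a'
print(s1.isdigit())  # Output: True
print(s2.isdigit())  # Output: False
print(s3.isdigit())  # Output: False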
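
isdigit() is sometimes used to validate numeric input, but it checks characters, not whether the text parses as a number. The sketch below shows a few edge cases and, as one possible alternative, a hypothetical parse_int helper that relies on int() instead.

```python
# Signs and decimal points are not digits.
print('-5'.isdigit())    # False
print('3.14'.isdigit())  # False
print(''.isdigit())      # False (empty string)

# Some Unicode characters count as digits even though int() rejects them.
print('²'.isdigit())     # True (superscript two)
print('²'.isdecimal())   # False


def parse_int(text):
    """Hypothetical helper: return int(text), or None if text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return None


print(parse_int('-5'))   # -5
print(parse_int('²'))    # None
print(parse_int('12a'))  # None
```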

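isalnum() method

isalnum() checks whether all characters in a string are letters or digits.

s1 = 'Hello123'
s2 = 'Hello!'
print(s1.isalnum())  # Output: True
print(s2.isalnum())  # Output: False

isspace() method

isspace() checks whether all characters in a string are whitespace.

s1 = '   '
s2 = 'Hello'
print(s1.isspace())  # Output: True
print(s2.isspace())  # Output: False

isupper() method

isupper() checks whether all letters in a string are uppercase. Characters without case are ignored, but the string must contain at least one letter.

s1 = 'HELLO'
s2 = 'Hello'
print(s1.isupper())  # Output: True
print(s2.isupper())  # Output: False

islower() method

islower() checks whether all letters in a string are lowercase. Characters without case are ignored, but the string must contain at least one letter.

s1 = 'hello'
s2 = 'Hello'
print(s1.islower())  # Output: True
print(s2.islower())  # Output: False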
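
The case checks look only at letters, and all of the test methods return False for an empty string. A short sketch of these edge cases, with arbitrary sample strings:

```python
# Digits, spaces, and punctuation are ignored by the case checks.
shouting = 'HELLO 123!'.isupper()
quiet = 'hello, world'.islower()
print(shouting)  # True
print(quiet)     # True

# With no letters at all, both methods return False.
digits_upper = '123'.isupper()
digits_lower = '123'.islower()
print(digits_upper)  # False
print(digits_lower)  # False

# The other checks return False for an empty string as well.
empty_results = (''.isalpha(), ''.isalnum(), ''.isspace())
print(empty_results)  # (False, False, False)
```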

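Mastering these common string operations lets developers handle text data more efficiently, whether in data cleaning, text analysis, or everyday programming tasks. These methods are the foundation of string processing in Python and important tools for building more complex applications and algorithms.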