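Code Example: Parsing an XML File and Extracting Data with Python's xml.etree.ElementTree
Published: 2023-12-11 16:41:34
The following code uses the xml.etree.ElementTree module from Python's standard library to parse an XML file and extract the relevant data: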
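# Import the required module
from xml.etree import ElementTree as ET

# Load the XML file and get its root element
tree = ET.parse('data.xml')
root = tree.getroot()

# Walk every element in the document (the root included)
for element in root.iter():
    # Print the element's tag name and text content
    print(element.tag, element.text)

# Extract elements with a specific tag:
# find all country elements anywhere below the root
countries = root.findall('.//country')

# Loop over the country elements and extract the relevant data
for country in countries:
    # Read the value of the country's name attribute
    country_name = country.get('name')
    # Read the text of the population child element
    population = country.find('population').text
    # Read the text of the description child element
    description = country.find('description').text
    # Print the extracted data
    print('Country name:', country_name)
    print('Population:', population)
    print('Description:', description)
    # Print a blank line between countries
    print()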
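The findall call above relies on the limited XPath subset that ElementTree supports. As a minimal sketch of the same idea, the snippet below selects a single country by its name attribute using an attribute predicate. The inline SAMPLE string is illustrative data with the same shape as data.xml, used here only so the snippet runs on its own.

```python
from xml.etree import ElementTree as ET

# Illustrative inline document with the same shape as data.xml (an assumption)
SAMPLE = """
<countries>
    <country name="USA"><population>328.2 million</population></country>
    <country name="China"><population>1.4 billion</population></country>
</countries>
"""

root = ET.fromstring(SAMPLE)

# Select one country by attribute value with an XPath predicate
china = root.find(".//country[@name='China']")
# find() returns None when nothing matches, so guard before reading children
china_population = china.findtext('population') if china is not None else None
print(china_population)

# A path that matches no element yields None rather than raising an error
missing = root.find(".//country[@name='Atlantis']")
print(missing)
```

Checking the result of find() for None before using it avoids an AttributeError when the document does not contain the expected element.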
The following example parses an XML file that contains country information. Suppose we have an XML file named data.xml with the following content:
<countries>
    <country name="USA">
        <population>328.2 million</population>
        <description>The United States of America (USA), commonly known as the United States (U.S. or US)...</description>
    </country>
    <country name="China">
        <population>1.4 billion</population>
        <description>China, officially the People's Republic of China (PRC), is a country in East Asia...</description>
    </country>
    <country name="India">
        <population>1.3 billion</population>
        <description>India, officially the Republic of India, is a country in South Asia...</description>
    </country>
</countries>
Using the code above, we can extract each country's name, population, and description from the XML file and print them.
The second loop prints the following (the output of the first loop, which lists every tag with its text, is omitted here):
Country name: USA
Population: 328.2 million
Description: The United States of America (USA), commonly known as the United States (U.S. or US)...

Country name: China
Population: 1.4 billion
Description: China, officially the People's Republic of China (PRC), is a country in East Asia...

Country name: India
Population: 1.3 billion
Description: India, officially the Republic of India, is a country in South Asia...
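This example is a simple demonstration of how to use Python's xml.etree.ElementTree module to parse an XML file and extract the relevant data from it.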
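In practice it is often more convenient to collect the extracted values than to print them directly. The sketch below is one possible way to do that, not the only one: the helper name extract_countries and the inline SAMPLE data are hypothetical, introduced only for illustration. It uses findtext, which returns None when a child element is absent, so a country without a description does not raise an AttributeError.

```python
from xml.etree import ElementTree as ET


def extract_countries(xml_text):
    """Return one dict per <country> element; missing children become None.

    Hypothetical helper, written for illustration only.
    """
    root = ET.fromstring(xml_text)
    records = []
    for country in root.iter('country'):
        records.append({
            'name': country.get('name'),
            # findtext returns None if the child element does not exist
            'population': country.findtext('population'),
            'description': country.findtext('description'),
        })
    return records


# Illustrative data: the second country deliberately has no description
SAMPLE = """<countries>
    <country name="USA">
        <population>328.2 million</population>
        <description>Example description</description>
    </country>
    <country name="India">
        <population>1.3 billion</population>
    </country>
</countries>"""

records = extract_countries(SAMPLE)
for record in records:
    print(record)
```

To read from a file instead of a string, the same loop can be applied to the root returned by ET.parse('data.xml').getroot().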
