Code example: parsing an XML file and extracting data with Python's xml.etree.ElementTree module

Published: 2023-12-11 16:41:34

The following code parses an XML file with Python's xml.etree.ElementTree module and extracts data from it:

# Import the required library
from xml.etree import ElementTree as ET

# Load the XML file
tree = ET.parse('data.xml')
root = tree.getroot()

# Walk every element in the XML file, starting with the root
for element in root.iter():
    # Print the element's tag name and text content
    print(element.tag, element.text)

# Extract elements with a specific tag
# Get all country elements
countries = root.findall('.//country')

# Loop over the country elements and extract the data
for country in countries:
    # Read the name attribute of the country element
    country_name = country.get('name')

    # Read the text of the population child element
    population = country.find('population').text

    # Read the text of the description child element
    description = country.find('description').text

    # Print the extracted data
    print('Country name:', country_name)
    print('Population:', population)
    print('Description:', description)
    # Print two blank lines between entries
    print('\n')
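
Note that country.find('population') returns None when the child element is absent, so reading .text from it would raise AttributeError. A minimal sketch of a more defensive variant follows. It assumes a small inline sample in which one country has no description. The SAMPLE string, the placeholder description text, and the 'N/A' fallback are illustrative and not part of the original data. It uses findtext with a default value:

```python
from xml.etree import ElementTree as ET

# Illustrative sample: the second country has no <description> child.
SAMPLE = """<countries>
    <country name="USA">
        <population>328.2 million</population>
        <description>Sample description</description>
    </country>
    <country name="China">
        <population>1.4 billion</population>
    </country>
</countries>"""

# fromstring returns the root element directly
root = ET.fromstring(SAMPLE)

rows = []
for country in root.findall('.//country'):
    # findtext returns the default instead of raising when the child is missing
    population = country.findtext('population', default='N/A')
    description = country.findtext('description', default='N/A')
    rows.append((country.get('name'), population, description))

for name, population, description in rows:
    print(name, population, description)
```

Whether a fallback string or an explicit error is the better choice depends on how strictly the input is expected to follow its schema.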

The example below parses an XML file that contains country information. Suppose we have an XML file named data.xml with the following contents:

<countries>
    <country name="USA">
        <population>328.2 million</population>
        <description>The United States of America (USA), commonly known as the United States (U.S. or US)...</description>
    </country>
    <country name="China">
        <population>1.4 billion</population>
        <description>China, officially the People's Republic of China (PRC), is a country in East Asia...</description>
    </country>
    <country name="India">
        <population>1.3 billion</population>
        <description>India, officially the Republic of India, is a country in South Asia...</description>
    </country>
</countries>
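
To try this without creating the file by hand, one option is to write the sample to a temporary data.xml and load it with ET.parse. This is a minimal sketch under stated assumptions: SAMPLE is an abbreviated copy of the document above (descriptions left out for brevity), and the temporary directory is only a convenience for the demonstration.

```python
import os
import tempfile
from xml.etree import ElementTree as ET

# Abbreviated copy of the sample document (descriptions omitted for brevity).
SAMPLE = """<countries>
    <country name="USA"><population>328.2 million</population></country>
    <country name="China"><population>1.4 billion</population></country>
    <country name="India"><population>1.3 billion</population></country>
</countries>"""

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.xml')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(SAMPLE)

    # parse reads the whole file, so the tree stays usable after cleanup
    tree = ET.parse(path)
    root = tree.getroot()

# country elements are direct children of the root element
names = [country.get('name') for country in root.findall('country')]
print(names)
```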

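Running the parsing code at the start of this article against this file extracts each country's name, population, and description, and prints them.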

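The output of the second loop is shown below (the first loop also prints the tag and text of every element, which is omitted here):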

Country name: USA
Population: 328.2 million
Description: The United States of America (USA), commonly known as the United States (U.S. or US)...


Country name: China
Population: 1.4 billion
Description: China, officially the People's Republic of China (PRC), is a country in East Asia...


Country name: India
Population: 1.3 billion
Description: India, officially the Republic of India, is a country in South Asia...

This example gives a brief demonstration of how to parse an XML file with Python's xml.etree.ElementTree module and extract data from it.
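
The extracted values are plain strings, so a natural next step is to collect them into records and convert the population to a number. The sketch below is only an illustration: parse_population and the SCALE table are hypothetical helpers, not part of ElementTree, and they assume every population value has the form "<number> <million|billion>" as in the sample data.

```python
from xml.etree import ElementTree as ET

# Hypothetical multipliers for the scale words used in the sample data.
SCALE = {'million': 1_000_000, 'billion': 1_000_000_000}


def parse_population(text):
    """Convert a string such as '1.4 billion' to an integer (hypothetical helper)."""
    value, unit = text.split()
    return round(float(value) * SCALE[unit])


# Abbreviated copy of the sample document (descriptions omitted for brevity).
XML_TEXT = """<countries>
    <country name="USA"><population>328.2 million</population></country>
    <country name="China"><population>1.4 billion</population></country>
    <country name="India"><population>1.3 billion</population></country>
</countries>"""

root = ET.fromstring(XML_TEXT)

# One dictionary per country, with the population as an integer
records = [
    {
        'name': country.get('name'),
        'population': parse_population(country.findtext('population')),
    }
    for country in root.findall('country')
]

# Pick the record with the largest population
largest = max(records, key=lambda record: record['population'])
print(largest['name'], largest['population'])
```

Keeping the conversion in a separate function makes it easy to swap in stricter validation if real data uses other formats.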