Quickstart

Base model

To declare an XML serializable and deserializable model, inherit it from the pydantic_xml.BaseXmlModel base class. The base class collects the data binding meta-information and generates an XML serializer for the model.

To serialize an object into an XML string, use the pydantic_xml.BaseXmlModel.to_xml() method; to deserialize it, use pydantic_xml.BaseXmlModel.from_xml(). For more information see XML serialization.

Data binding

A model field can be bound to an XML attribute, element or text. The binding type is derived using the following rules:

1. Primitives

A field of a primitive type (int, float, str, datetime, …) is bound to the element text by default:

<Company>
    Space Exploration Technologies Corp.
</Company>

class Company(BaseXmlModel):
    description: constr(strip_whitespace=True)

To alter the default behaviour, mark the field with pydantic_xml.attr():

<Company trade-name="SpaceX" type="Private"/>

class Company(BaseXmlModel):
    trade_name: str = attr(name='trade-name')
    type: str = attr()

or pydantic_xml.element():

<company>
    <founded>2002-03-14</founded>
    <web-site>https://www.spacex.com</web-site>
</company>

class Company(BaseXmlModel, tag='company'):
    founded: dt.date = element()
    website: HttpUrl = element(tag='web-site')

For more information see text, attributes and elements bindings declarations.

2. Sub-models

A field of a model type (inherited from BaseXmlModel) is bound to a sub-element:

<company>
    <headquarters>
        <country>US</country>
        <state>California</state>
        <city>Hawthorne</city>
    </headquarters>
</company>

class Headquarters(BaseXmlModel, tag='headquarters'):
    country: str = element()
    state: str = element()
    city: str = element()


class Company(BaseXmlModel, tag='company'):
    headquarters: Headquarters

For more information see model types.

3. Mapping

A field of a mapping type (Dict[str, str], Mapping[str, int], TypedDict …) is bound to the attributes of the local element by default:

<Company trade-name="SpaceX" type="Private"/>

class Company(BaseXmlModel):
    properties: Dict[str, str]

or to sub-element attributes if the field is marked as pydantic_xml.element():

<Company>
    <Founder name="Elon" surname="Musk"/>
</Company>

class Company(BaseXmlModel):
    founder: Dict[str, str] = element(tag='Founder')

For more information see mappings.

4. Primitive collection

A field of a primitive collection type (List[str], Set[int], Tuple[float, float] …) is bound to the texts of sub-elements:

<Company>
    <Product>Several launch vehicles</Product>
    <Product>Starlink</Product>
    <Product>Starship</Product>
</Company>

class Company(BaseXmlModel):
    products: List[str] = element(tag='Product')

For more information see primitive heterogeneous collections.

5. Model collection

A field of a model collection type (List[BaseXmlModel], Tuple[BaseXmlModel, ...]) is bound to sub-elements:

<Company>
    <social type="linkedin">https://www.linkedin.com/company/spacex</social>
    <social type="twitter">https://twitter.com/spacex</social>
    <social type="youtube">https://www.youtube.com/spacex</social>

    <product status="running" launched="2013">Several launch vehicles</product>
    <product status="running" launched="2019">Starlink</product>
    <product status="development">Starship</product>
</Company>

class Social(BaseXmlModel):
    type: str = attr()
    url: str


class Product(BaseXmlModel):
    status: Literal['running', 'development'] = attr()
    launched: Optional[int] = attr(default=None)
    title: str


class Company(BaseXmlModel):
    socials: Tuple[Social, ...] = element(tag='social')
    products: List[Product] = element(tag='product')

For more information see homogeneous collections and heterogeneous collections.

6. Wrapper

A wrapped field (marked as pydantic_xml.wrapped()) is bound to a sub-element located at the provided path. From there, the field type determines the binding according to the rules described above:

<Company>
    <Info>
        <Headquarters>
            <Location>
                <City>Hawthorne</City>
                <Country>US</Country>
            </Location>
        </Headquarters>
    </Info>
</Company>

class Company(BaseXmlModel):
    city: str = wrapped(
        'Info/Headquarters/Location',
        element(tag='City'),
    )
    country: str = wrapped(
        'Info/Headquarters/Location/Country',
    )

For more information see wrapped entities.

Example

The following example illustrates all the previously described rules combined with some pydantic features:

doc.xml:

<Company trade-name="SpaceX" type="Private" xmlns:pd="http://www.company.com/prod">
    <Founder name="Elon" surname="Musk"/>
    <Founded>2002-03-14</Founded>
    <Employees>12000</Employees>
    <WebSite>https://www.spacex.com</WebSite>

    <Industries>
        <Industry>space</Industry>
        <Industry>communications</Industry>
    </Industries>

    <key-people>
        <person position="CEO" name="Elon Musk"/>
        <person position="CTO" name="Elon Musk"/>
        <person position="COO" name="Gwynne Shotwell"/>
    </key-people>

    <hq:headquarters xmlns:hq="http://www.company.com/hq">
        <hq:country>US</hq:country>
        <hq:state>California</hq:state>
        <hq:city>Hawthorne</hq:city>
    </hq:headquarters>

    <co:contacts xmlns:co="http://www.company.com/contact">
        <co:socials>
            <co:social co:type="linkedin">https://www.linkedin.com/company/spacex</co:social>
            <co:social co:type="twitter">https://twitter.com/spacex</co:social>
            <co:social co:type="youtube">https://www.youtube.com/spacex</co:social>
        </co:socials>
    </co:contacts>

    <pd:product pd:status="running" pd:launched="2013">Several launch vehicles</pd:product>
    <pd:product pd:status="running" pd:launched="2019">Starlink</pd:product>
    <pd:product pd:status="development">Starship</pd:product>
</Company>

model.py:

import pathlib
from datetime import date
from enum import Enum
from typing import Dict, List, Literal, Optional, Set, Tuple

import pydantic as pd
from pydantic import HttpUrl, conint

from pydantic_xml import BaseXmlModel, RootXmlModel, attr, element, wrapped

NSMAP = {
    'co': 'http://www.company.com/contact',
    'hq': 'http://www.company.com/hq',
    'pd': 'http://www.company.com/prod',
}


class Headquarters(BaseXmlModel, ns='hq', nsmap=NSMAP):
    country: str = element()
    state: str = element()
    city: str = element()

    @pd.field_validator('country')
    @classmethod
    def validate_country(cls, value: str) -> str:
        if len(value) > 2:
            raise ValueError('country must be of 2 characters')
        return value


class Industries(RootXmlModel):
    root: Set[str] = element(tag='Industry')


class Social(BaseXmlModel, ns_attrs=True, ns='co', nsmap=NSMAP):
    type: str = attr()
    url: HttpUrl


class Product(BaseXmlModel, ns_attrs=True, ns='pd', nsmap=NSMAP):
    status: Literal['running', 'development'] = attr()
    launched: Optional[int] = attr(default=None)
    title: str


class Person(BaseXmlModel):
    name: str = attr()


class CEO(Person):
    position: Literal['CEO'] = attr()


class CTO(Person):
    position: Literal['CTO'] = attr()


class COO(Person):
    position: Literal['COO'] = attr()


class Company(BaseXmlModel, tag='Company', nsmap=NSMAP):
    class CompanyType(str, Enum):
        PRIVATE = 'Private'
        PUBLIC = 'Public'

    trade_name: str = attr(name='trade-name')
    type: CompanyType = attr()
    founder: Dict[str, str] = element(tag='Founder')
    founded: Optional[date] = element(tag='Founded')
    employees: conint(gt=0) = element(tag='Employees')
    website: HttpUrl = element(tag='WebSite')

    industries: Industries = element(tag='Industries')

    key_people: Tuple[CEO, CTO, COO] = wrapped('key-people', element(tag='person'))
    headquarters: Headquarters
    socials: List[Social] = wrapped(
        'contacts/socials',
        element(tag='social', default_factory=list),
        ns='co',
        nsmap=NSMAP,
    )

    products: Tuple[Product, ...] = element(tag='product', ns='pd')


xml_doc = pathlib.Path('./doc.xml').read_text()

company = Company.from_xml(xml_doc)

json_doc = pathlib.Path('./doc.json').read_text()
assert company == Company.model_validate_json(json_doc)

JSON

Since pydantic supports JSON serialization, pydantic-xml can be used as an XML-to-JSON transcoder:

...

xml_doc = pathlib.Path('./doc.xml').read_text()
company = Company.from_xml(xml_doc)

json_doc = pathlib.Path('./doc.json')
json_doc.write_text(company.model_dump_json(indent=4))

doc.json:

{
    "trade_name": "SpaceX",
    "type": "Private",
    "founder": {
        "name": "Elon",
        "surname": "Musk"
    },
    "founded": "2002-03-14",
    "employees": 12000,
    "website": "https://www.spacex.com",
    "industries": [
        "communications",
        "space"
    ],
    "key_people": [
        {
            "name": "Elon Musk",
            "position": "CEO"
        },
        {
            "name": "Elon Musk",
            "position": "CTO"
        },
        {
            "name": "Gwynne Shotwell",
            "position": "COO"
        }
    ],
    "headquarters": {
        "country": "US",
        "state": "California",
        "city": "Hawthorne"
    },
    "socials": [
        {
            "type": "linkedin",
            "url": "https://www.linkedin.com/company/spacex"
        },
        {
            "type": "twitter",
            "url": "https://twitter.com/spacex"
        },
        {
            "type": "youtube",
            "url": "https://www.youtube.com/spacex"
        }
    ],
    "products": [
        {
            "status": "running",
            "launched": 2013,
            "title": "Several launch vehicles"
        },
        {
            "status": "running",
            "launched": 2019,
            "title": "Starlink"
        },
        {
            "status": "development",
            "launched": null,
            "title": "Starship"
        }
    ]
}