Elements#
Primitive types#
A field of a primitive type marked with pydantic_xml.element() is bound to the text of a sub-element.
The tag parameter declares the tag of the sub-element from which the text is extracted.
If it is omitted, the field name is used (respecting pydantic field aliases).
class Company(BaseXmlModel, tag='company'):
    founded: dt.date = element()
    website: HttpUrl = element(tag='web-size')

<company>
    <founded>2002-03-14</founded>
    <web-size>https://www.spacex.com</web-size>
</company>

{
    "founded": "2002-03-14",
    "website": "https://www.spacex.com"
}
Model types#
A field of a model type marked with pydantic_xml.element() is bound to a sub-element, which is then used as the root for that sub-model. For more information see model data binding.
The tag parameter declares the tag of the sub-element to which the sub-model is bound.
If it is omitted, the sub-model's tag setting is used.
If that is omitted too, the field name is used (respecting pydantic field aliases).
So the order of precedence is: element tag, model tag, field alias, field name.
class Headquarters(BaseXmlModel, tag='headquarters'):
    country: str = element()
    state: str = element()
    city: str = element()

class Company(BaseXmlModel, tag='company'):
    headquarters: Headquarters

<company>
    <headquarters>
        <country>US</country>
        <state>California</state>
        <city>Hawthorne</city>
    </headquarters>
</company>

{
    "headquarters": {
        "country": "US",
        "state": "California",
        "city": "Hawthorne"
    }
}
Namespaces#
You can declare an element namespace by passing the ns and nsmap parameters to pydantic_xml.element(), where ns is the element namespace alias and nsmap is a namespace mapping:
class Company(BaseXmlModel, tag='company'):
    founded: dt.date = element(
        ns='co',
        nsmap={'co': 'http://www.company.com/co'},
    )
    website: HttpUrl = element(tag='web-size')

<company>
    <co:founded xmlns:co="http://www.company.com/co">2002-03-14</co:founded>
    <web-size>https://www.spacex.com</web-size>
</company>

{
    "founded": "2002-03-14",
    "website": "https://www.spacex.com"
}
A namespace mapping can also be declared for a model. In that case all fields inherit that mapping, so each field only has to name its namespace alias with ns:
class Company(
    BaseXmlModel,
    tag='company',
    ns='co',
    nsmap={'co': 'http://www.company.com/co'},
):
    founded: dt.date = element(ns='co')
    website: HttpUrl = element(tag='web-size', ns='co')

<co:company xmlns:co="http://www.company.com/co">
    <co:founded>2002-03-14</co:founded>
    <co:web-size>https://www.spacex.com</co:web-size>
</co:company>

{
    "founded": "2002-03-14",
    "website": "https://www.spacex.com"
}
A namespace and a namespace mapping can also be applied to fields of model types by passing ns and nsmap to pydantic_xml.element(). If they are omitted, the sub-model's own namespace and namespace mapping are used:
class Headquarters(
    BaseXmlModel,
    tag='headquarters',
    ns='hq',
    nsmap={'hq': 'http://www.company.com/hq'},
):
    country: str = element(ns='hq')
    state: str = element(ns='hq')
    city: str = element(ns='hq')

class Company(BaseXmlModel, tag='company'):
    headquarters: Headquarters

<company>
    <hq:headquarters xmlns:hq="http://www.company.com/hq">
        <hq:country>US</hq:country>
        <hq:state>California</hq:state>
        <hq:city>Hawthorne</hq:city>
    </hq:headquarters>
</company>

{
    "headquarters": {
        "country": "US",
        "state": "California",
        "city": "Hawthorne"
    }
}
Elements search mode#
A model supports several element search strategies (modes). Each strategy has its own pros and cons.
Strict (default)#
An element to which a field will be bound is searched for sequentially, one by one, without skipping unknown elements. If the tag of the next element doesn’t match the field tag, that field is considered unbound. This mode is used when strict document validation is required. It is also the best choice for parsing very large documents because it runs in predictable time, since it requires no look-ahead operations.
class Company(BaseXmlModel, tag='Company', search_mode='strict'):
    founded: str = element(tag='Founded')
    website: str = element(tag='WebSite')

Error
Parsing the following document raises an exception because its elements are not in the order of the model fields.

<Company>
    <WebSite>https://www.spacex.com</WebSite>
    <Founded>2002-03-14</Founded>
</Company>

{}
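The strict strategy can be illustrated with a simplified sketch built on the standard library. This is not pydantic-xml's implementation; bind_strict is a hypothetical helper that only mirrors the description above: each field is compared against the element at the current position, and nothing is skipped.

```python
import xml.etree.ElementTree as ET


def bind_strict(root, field_tags):
    """Bind each field tag to the element at the current position only."""
    children = list(root)
    position = 0
    bound = {}
    for tag in field_tags:
        if position < len(children) and children[position].tag == tag:
            bound[tag] = children[position].text
            position += 1  # advance only on a match; never look ahead
        # otherwise the field stays unbound
    return bound


doc = ET.fromstring(
    '<Company>'
    '<WebSite>https://www.spacex.com</WebSite>'
    '<Founded>2002-03-14</Founded>'
    '</Company>'
)

bound = bind_strict(doc, ['Founded', 'WebSite'])
missing = [tag for tag in ('Founded', 'WebSite') if tag not in bound]
print(bound)    # only WebSite is bound
print(missing)  # Founded is left unbound
```

In this sketch the Founded field is checked against the first element, which is WebSite, so it stays unbound. In a real model a required field left unbound is what makes validation fail.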
Ordered#
An element to which a field will be bound is searched for sequentially, skipping unknown elements. If the tag of the next element doesn’t match the field tag, that element is skipped and the search continues. This mode is used when element order matters but unexpected (or irrelevant) elements may appear in a document.
class Company(BaseXmlModel, tag='Company', search_mode='ordered'):
    founded: str = element(tag='Founded')
    website: str = element(tag='WebSite')

<Company>
    <Founded>2002-03-14</Founded>
    <Founder name="Elon" surname="Musk"/>
    <WebSite>https://www.spacex.com</WebSite>
</Company>

{
    "founded": "2002-03-14",
    "website": "https://www.spacex.com"
}
Warning
This mode can lead to unexpected results. For example, the following model:
class Model(BaseXmlModel, search_mode='ordered'):
    field1: Optional[str] = element(tag='element1')
    field2: str = element(tag='element2')
    field3: str = element(tag='element1')

will fail for the following document:
<Model>
    <element2>value</element2>
    <element1>value</element1>
</Model>

because the first field will be bound to the second element (the algorithm looks ahead until the first match is found, which is the second element), and the second field will not be bound to any element.
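Both the skipping behaviour and the pitfall from the warning can be reproduced with a simplified standard-library sketch. It is not pydantic-xml's implementation; bind_ordered is a hypothetical helper that scans forward from the last match and skips elements whose tag differs.

```python
import xml.etree.ElementTree as ET


def bind_ordered(root, fields):
    """Bind (field name, tag) pairs in order, skipping non-matching elements."""
    children = list(root)
    position = 0
    bound = {}
    for name, tag in fields:
        for index in range(position, len(children)):
            if children[index].tag == tag:
                bound[name] = children[index].text
                position = index + 1  # later fields only see what follows
                break
    return bound


# An unknown element between two known ones is skipped.
company = ET.fromstring(
    '<Company>'
    '<Founded>2002-03-14</Founded>'
    '<Founder name="Elon" surname="Musk"/>'
    '<WebSite>https://www.spacex.com</WebSite>'
    '</Company>'
)
company_bound = bind_ordered(
    company, [('founded', 'Founded'), ('website', 'WebSite')],
)

# The pitfall: field1 looks ahead and consumes the only <element1>,
# leaving nothing after it for field2 and field3.
model = ET.fromstring(
    '<Model>'
    '<element2>value</element2>'
    '<element1>value</element1>'
    '</Model>'
)
model_bound = bind_ordered(
    model,
    [('field1', 'element1'), ('field2', 'element2'), ('field3', 'element1')],
)
print(company_bound)
print(model_bound)
```

In the second case only field1 ends up bound, because the search position has already moved past element2 by the time field2 is processed.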
Unordered#
An element to which a field will be bound is searched for among all sub-elements, regardless of their order.
This mode is used when element order doesn’t matter.
The worst-case time complexity of this strategy is O(F*E), where F is the number of fields and E is the number of sub-elements.
class Company(BaseXmlModel, tag='Company', search_mode='unordered'):
    founded: str = element(tag='Founded')
    website: str = element(tag='WebSite')

<Company>
    <WebSite>https://www.spacex.com</WebSite>
    <Founded>2002-03-14</Founded>
</Company>

{
    "founded": "2002-03-14",
    "website": "https://www.spacex.com"
}
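The O(F*E) bound follows from each field potentially scanning every sub-element. The sketch below uses only the standard library and is not pydantic-xml's implementation; bind_unordered and its comparison counter are hypothetical and exist only to make the cost visible.

```python
import xml.etree.ElementTree as ET


def bind_unordered(root, fields):
    """Bind (field name, tag) pairs to any unused element with a matching tag."""
    children = list(root)
    used = set()
    bound = {}
    comparisons = 0
    for name, tag in fields:                      # F fields
        for index, child in enumerate(children):  # up to E sub-elements each
            if index in used:
                continue
            comparisons += 1
            if child.tag == tag:
                bound[name] = child.text
                used.add(index)
                break
    return bound, comparisons


doc = ET.fromstring(
    '<Company>'
    '<WebSite>https://www.spacex.com</WebSite>'
    '<Founded>2002-03-14</Founded>'
    '</Company>'
)
fields = [('founded', 'Founded'), ('website', 'WebSite')]

bound, comparisons = bind_unordered(doc, fields)
print(bound)
print(comparisons, 'tag comparisons; upper bound is', len(fields) * len(doc))
```

Every field restarts its scan from the first sub-element, which is what removes the dependence on order and also what produces the F*E worst case.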