A Sample XML Document
Explaining what namespaces are is difficult without a sample XML document to illustrate the concept. Therefore, here's a very simple XML document that does not use namespaces. In a moment, I'll discuss how to add them.
<invoice id="14561-07">
  <heading>
    <shipto>
      <name>Scott Klement</name>
      <address>
        <street>123 Sesame St.</street>
        <city>Anytown</city>
        <state>WI</state>
        <country>USA</country>
      </address>
    </shipto>
  </heading>
  <Body>
    <item sku="54321-A4">
      <description>Lovely Green Widget</description>
      <price>50.00</price>
    </item>
  </Body>
</invoice>
For anyone new to XML, here's a quick syntax review:
- Each XML element begins with a tag like <name> and ends with the same thing preceded by a slash, such as </name>. So in the preceding example, the words "Scott Klement" are associated with the <name> tag because they're between it and its associated end tag.
- Some XML tags have "attributes" that further describe the tag. In the preceding example, the <invoice> tag has an attribute that says that the invoice ID (or "invoice number," as we more commonly call it) is 14561-07.
- XML tags can contain other XML tags inside them. The <address> tag contains several other tags, each one a part of the address. The <invoice> tag contains the entire invoice.
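To make the three points of the syntax review concrete, here is a minimal sketch in Python (not the RPG used later in this article) that reads the sample invoice with the standard library's xml.etree.ElementTree module. The variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# The sample invoice from this article, without namespaces.
INVOICE_XML = """
<invoice id="14561-07">
  <heading>
    <shipto>
      <name>Scott Klement</name>
      <address>
        <street>123 Sesame St.</street>
        <city>Anytown</city>
        <state>WI</state>
        <country>USA</country>
      </address>
    </shipto>
  </heading>
  <Body>
    <item sku="54321-A4">
      <description>Lovely Green Widget</description>
      <price>50.00</price>
    </item>
  </Body>
</invoice>
"""

root = ET.fromstring(INVOICE_XML)

# Text between a start tag and its end tag belongs to that element.
name = root.findtext("heading/shipto/name")

# Attributes further describe the tag they appear on.
invoice_id = root.get("id")

# Elements can contain other elements: list the parts of the address.
address_parts = [child.tag for child in root.find("heading/shipto/address")]

print(name, invoice_id, address_parts)
```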
What Is a Namespace?
You've created your XML invoice, as shown in the example, and all is going well when you suddenly get a new request. The request involves sending this invoice to a customer. The customer requires you to insert your invoice's XML inside another XML document to use as an "envelope." Here's a very simple example of what an envelope might look like:
<Envelope>
  <Body>
    ...XML data for invoice is inserted here...
  </Body>
</Envelope>
Now you have a conflict. When the XML for the invoice is inserted, there will be two different <Body> tags. In one place, <Body> symbolizes the body of a network transmission. In the other, it symbolizes the body of an invoice. The two names conflict. You need a way of differentiating one from the other.
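The conflict is easy to reproduce. The following Python sketch (an illustration, not part of the RPG code in this article) wraps a shortened copy of the invoice in the envelope and searches for Body by name. Without namespaces, the search cannot tell the two kinds of Body apart.

```python
import xml.etree.ElementTree as ET

# A shortened invoice placed inside the envelope, with no namespaces.
ENVELOPE_XML = """
<Envelope>
  <Body>
    <invoice id="14561-07">
      <Body>
        <item sku="54321-A4"/>
      </Body>
    </invoice>
  </Body>
</Envelope>
"""

root = ET.fromstring(ENVELOPE_XML)

# Both the transmission body and the invoice body match the same name.
bodies = list(root.iter("Body"))
print(len(bodies))
```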
A namespace is simply a prefix that distinguishes all the XML tags that come from a particular schema. For example:
<net:Envelope>
  <net:Body>
    <inv:invoice id="14561-07">
      <inv:heading>
        <inv:shipto>
          <inv:name>Scott Klement</inv:name>
          <inv:address>
            <inv:street>123 Sesame St.</inv:street>
            <inv:city>Anytown</inv:city>
            <inv:state>WI</inv:state>
            <inv:country>USA</inv:country>
          </inv:address>
        </inv:shipto>
      </inv:heading>
      <inv:Body>
        <inv:item sku="54321-A4">
          <inv:description>Lovely Green Widget</inv:description>
          <inv:price>50.00</inv:price>
        </inv:item>
      </inv:Body>
    </inv:invoice>
  </net:Body>
</net:Envelope>
In this example, all the XML tags related to the network transport envelope schema begin with "net," and all the elements related to the invoice schema begin with "inv." That way, it's easy to tell them apart. It's easy to see that <inv:Body> is different from <net:Body>.
However, there's still a problem! The prefixes make clear which elements are which, but what about conflicts between the prefixes themselves? Perhaps two different customers using two different network schemas might both use "net" as their prefix. Or perhaps one software package on your system designates a prefix of "inv" for inventory, and another package uses that same prefix for invoice. Something needs to be done to keep each namespace separate, so that anyone, anywhere, can create an XML namespace without it conflicting with anyone else!
Think about that for a moment. Anyone can create his or her own XML schema. Anyone can, therefore, create his or her own prefix. How on earth can you guarantee that none of them conflict?
The solution is to prefix each namespace definition with the creator's Internet domain name. Everyone who has a domain name registered on the Internet (which is just about every business these days) can use this "web address" to keep their XML prefixes separate from everyone else's.
For example, you could prefix the <Body> tag from the previous examples with an identifier built on a domain name that you control:
http://www.scottklement.com/xml/schemas/invoice:Body
vs.
http://www.systeminetwork.com/klement/network:Body
In the preceding example, I don't have to worry about anyone else using the same prefixes as me, because I've included my complete internet domain name in the start of the prefix! Because I'm the only one who has that domain name, there's no chance of a naming conflict.
Note: The preceding namespaces need not point to an actual web page. The domain name prefix is added purely to keep them unique from anyone else's prefixes.
Imagine using prefixes like that on every single XML tag. Ack! Adding the domain names made them unique, but it created another problem. Who wants to use a prefix that's, for example, 53 characters long on every XML tag?! Wow. That would make for a very ugly and ungainly XML document.
The final solution is to create a short placeholder for each unique name and use that placeholder as the prefix. That's what a namespace is. It's a unique identifier associated with a short prefix that can be placed before any XML elements. This is done using the "xmlns" (XML namespace) attribute on an XML tag. The xmlns attribute has the following syntax:
xmlns:prefix="Unique ID for namespace"
For example, here's the previous invoice example using an XML namespace:
<net:Envelope xmlns:net="http://www.systeminetwork.com/klement/network">
  <net:Body>
    <inv:invoice id="14561-07" xmlns:inv="http://www.scottklement.com/xml/schemas/invoice">
      <inv:heading>
        <inv:shipto>
          <inv:name>Scott Klement</inv:name>
          <inv:address>
            <inv:street>123 Sesame St.</inv:street>
            <inv:city>Anytown</inv:city>
            <inv:state>WI</inv:state>
            <inv:country>USA</inv:country>
          </inv:address>
        </inv:shipto>
      </inv:heading>
      <inv:Body>
        <inv:item sku="54321-A4">
          <inv:description>Lovely Green Widget</inv:description>
          <inv:price>50.00</inv:price>
        </inv:item>
      </inv:Body>
    </inv:invoice>
  </net:Body>
</net:Envelope>
In the preceding example, I've associated both "net" and "inv" with unique identifiers. The fact that they refer to Internet domain names uniquely identifies the schema that they originated with. The fact that they still use short, easy-to-read prefixes keeps the document from getting ungainly.
As far as an XML parser is concerned, I can change the prefixes to anything I want without changing the meaning of the document, as long as it still refers to the same unique ID. For example, the following two documents have exactly the same meaning:
<net:Envelope xmlns:net="http://schemas.xmlsoap.org/soap/envelope/">
  <net:Body>
    ...web service parameters go here...
  </net:Body>
</net:Envelope>

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    ...web service parameters go here...
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
The fact that one of them uses the prefix "net" and the other uses the prefix "SOAP-ENV" is inconsequential. The two documents are equivalent, because both prefixes are bound to the same unique namespace identifier and therefore mean exactly the same thing. This way, you can always avoid having a duplicate prefix name. If a duplicate ever arises, you simply change it to a different prefix without changing the unique identifier, and you've solved the problem.
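One way to check this claim is to let a namespace-aware parser expand the prefixes and then compare the results. The sketch below uses Python's xml.etree.ElementTree, which reports each element as {namespace-id}name. The function name expanded_names is made up for this illustration, and the Body elements are left empty to keep it short.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Same namespace identifier, two different prefixes.
DOC_NET = (
    '<net:Envelope xmlns:net="' + SOAP_NS + '">'
    "<net:Body/>"
    "</net:Envelope>"
)
DOC_SOAP = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="' + SOAP_NS + '">'
    "<SOAP-ENV:Body/>"
    "</SOAP-ENV:Envelope>"
)

def expanded_names(xml_text):
    """Return every element name with its prefix expanded to the full ID."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]

# The prefixes disappear; only the namespace ID and local name remain.
print(expanded_names(DOC_NET) == expanded_names(DOC_SOAP))
```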
Parsing XML Namespaces with Expat
Note: There are many other XML parsers available, but Expat is the one I use, so it's the one I'm able to write about. If you use something else in an i5/OS programming language and would like to write an article about it, please contact me at programmingtips@systeminetwork.com.
I've written several articles about using Expat, an open-source XML parser, from an RPG program. If you haven't read the previous articles, or you need to brush up on how to use Expat, you might want to read these articles. The following link goes to one of the articles, and that article contains links to the others: http://www.systeminetwork.com/article.cfm?id=53061
The way to make Expat work with namespaces is similar to the way Expat is used for ordinary documents. The primary difference is the routine you call to "create a parser object." (For anyone unfamiliar with object-oriented terminology: Creating a parser really refers to creating a temporary work space for the parser to use for its internal variables related to this particular XML document.)
In previous articles, I told you to use XML_ParserCreate(). To enable namespace support, delete the line of code for XML_ParserCreate() and replace it with a call to XML_ParserCreateNS():
       p = XML_ParserCreateNS(*OMIT: x'0C');
       if (p = *NULL);
          // Creation of parser failed...
       endif;
The XML_ParserCreateNS() API accepts two parameters:
- encoding = The character set that the XML document is encoded with, or *OMIT if you want Expat to auto-detect the encoding. This is identical to the parameter you pass to the XML_ParserCreate() API.
- nsSeparator = Specifies the character that Expat uses to separate the namespace identifier from the XML element name. More information is provided in the next section of this article.
The XML_ParserCreateNS() API returns a pointer to the temporary space, just as the XML_ParserCreate() API does. If an error occurs, it returns *NULL.
The most important difference between using XML_ParserCreateNS() and XML_ParserCreate() is that the former API enables namespace processing when it calls the subprocedures that you've registered as Start XML element handlers or End XML element handlers.
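The effect of enabling namespace processing can be seen outside of RPG as well. The following sketch uses Python's xml.parsers.expat module, which wraps the same Expat library. Its ParserCreate() function takes an optional namespace_separator argument that plays the role of the nsSeparator parameter described above. This illustrates the behavior through the Python binding, not the i5/OS port, and the helper name element_names is made up for the example.

```python
import xml.parsers.expat

DOC = (
    '<net:Envelope xmlns:net="http://www.systeminetwork.com/klement/network">'
    "<net:Body/>"
    "</net:Envelope>"
)

def element_names(namespace_separator=None):
    """Parse DOC and return the names passed to the start element handler."""
    names = []
    # With no separator, namespace processing stays off (like XML_ParserCreate).
    # With a separator, it is switched on (like XML_ParserCreateNS).
    parser = xml.parsers.expat.ParserCreate(None, namespace_separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(DOC, True)
    return names

# Without namespace processing, the handler sees the raw prefixed names.
print(element_names())

# With it, the prefix is replaced by the full identifier plus the separator.
print(element_names("\x0c"))
```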
How Your Handlers Change with Namespaces
When namespace processing is enabled, Expat adds the unique namespace identifier (i.e., the long identifier with the domain name in it) as a prefix to the element name. When it calls your start handler subprocedure or your end handler subprocedure, the element name has this namespace prefix added to it.
To keep the element name separated from the namespace prefix, Expat inserts a separator character. This is the character that you specified in the second parameter of the XML_ParserCreateNS() API.
I typically use x'0C' as a namespace separator because it never appears in either a unique namespace identifier or an XML element name. When I find an x'0C' in the element name, I know that it must be a namespace separator.
For example, I have the following code in the subprocedure that I've registered as the "start XML handler." It looks for a namespace identifier in the element name, and if one is found, it splits it off into a nameSpace field:
     D elemName        s            400C   varying
     D nameSpace       s            400C   varying
     D nspos           s             10I 0
     D len             s             10I 0
      /free
        // elem is the null-terminated UCS-2 element name passed by Expat
        len = %scan(U'0000':elem) - 1;
        elemName = %subst(elem:1:len);
        // -----------------------------------------
        //  If the namespace separator is found
        //  split the nameSpace from the elemName
        // -----------------------------------------
        nspos = %scan(U'000C': elemName);
        if (nspos > 0);
           nameSpace = %subst(elemName:1:nspos-1);
           elemName  = %subst(elemName:nspos+1);
        else;
           // No separator found: the element is not in any namespace
           nameSpace = %ucs2('');
        endif;
At this point in the program, when Expat is parsing the <Envelope> XML element, the following values are set:
elemName = Envelope
nameSpace = http://www.systeminetwork.com/klement/network
Similarly, when it gets to the <invoice> XML element, the same variables will have these values:
elemName = invoice
nameSpace = http://www.scottklement.com/xml/schemas/invoice
Because Expat expands the prefix into the full, unique namespace identifier, it doesn't matter which of the short prefix names was used in the document. When I want to check for a prefix, I check using the long name, which is always the same for a given schema.
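For readers who don't know RPG, the split performed by the handler above can be sketched in a few lines of Python. The function name split_expanded_name is made up for this illustration. It assumes, as this article does, that x'0C' is the separator and never appears in a namespace identifier or element name.

```python
SEPARATOR = "\x0c"  # the same separator character used in this article

def split_expanded_name(expanded, separator=SEPARATOR):
    """Split the name Expat reports into (namespace, element name)."""
    pos = expanded.find(separator)
    if pos >= 0:
        # Everything before the separator is the namespace identifier.
        return expanded[:pos], expanded[pos + 1:]
    # No separator: the element does not belong to any namespace.
    return "", expanded

namespace, elem_name = split_expanded_name(
    "http://www.scottklement.com/xml/schemas/invoice" + SEPARATOR + "invoice"
)
print(namespace)
print(elem_name)
```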
For example, if I want to write code that counts the number of "item" elements in the "invoice" namespace, I can code the following:
     D INVOICE_NAMESPACE...
     D                 C                   %ucs2('http://www.scottklement-
     D                                     .com/xml/schemas/invoice')
       .
       .
       if ( nameSpace = INVOICE_NAMESPACE
            and elemName = %ucs2('item') );
          count = count + 1;
       endif;
As the preceding code demonstrates, I like to use named constants for the namespaces that my program is expecting. That way, I don't have to type the whole unique namespace ID in every single place that I want to check it. Plus, I'm far less likely to make typos this way!
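The same counting logic can be tried end to end with Python's Expat binding. This is only a sketch: the function name count_invoice_items, the sample document, and the urn:example:other namespace are all invented for the illustration. The sample deliberately uses the prefix "x" instead of "inv" to show that only the long identifier matters.

```python
import xml.parsers.expat

INVOICE_NAMESPACE = "http://www.scottklement.com/xml/schemas/invoice"
SEPARATOR = "\x0c"

# Hypothetical test document: two invoice items plus one unrelated "item".
SAMPLE = (
    '<x:invoice xmlns:x="' + INVOICE_NAMESPACE + '" '
    'xmlns:o="urn:example:other">'
    '<x:item sku="54321-A4"/>'
    '<x:item sku="54321-B7"/>'
    "<o:item/>"
    "</x:invoice>"
)

def count_invoice_items(xml_text):
    """Count the item elements that belong to the invoice namespace."""
    count = 0
    wanted = INVOICE_NAMESPACE + SEPARATOR + "item"

    def start_element(name, attrs):
        nonlocal count
        if name == wanted:
            count += 1

    parser = xml.parsers.expat.ParserCreate(None, SEPARATOR)
    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)
    return count

print(count_invoice_items(SAMPLE))
```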
Code Download
To demonstrate everything that I explain in this article, I provide a sample program intended to be compiled against my port of Expat 2.0.0. You can download my sample program and the demonstration XML file from the following link: http://www.pentontech.com/IBMContent/Documents/article/54265_181_Namespaces.zip
You can download my i5/OS port of the Expat open-source XML parser from the following link: http://www.scottklement.com/expat/
Posted 20-03-2007 at 11:43, written by Qmma