Content Management and Distribution Using XML
Parsing the XML
It seems like every language has a different way of parsing XML. Microsoft's implementation (the MSXML DOM parser) is one of the
better ones we've found. Among other things, it supports retrieving node lists, counting the child nodes of any node, and logical data
extraction. With it, we can walk through our XML tree and collect all the data we want with a
simple For loop. To keep our code organized, we've created a class for our articles:
Class xArticle
    Public Heading
    Public Link
    Public Summary
    Public Paras()
End Class
This simple class houses each part of an article as a public variable. That way, we can fill an array with
xArticle objects as we parse the XML, so the data is in a nice, easy-to-extract format when we want to output it. But before we
send our XML news document to our parsing function, we pull out the non-repeating data with the convenient .selectSingleNode method:
' Get the non-article data from the XML Document
' (read the text directly instead of reusing one variable for both node and string)
sLogo = XMLDoc.selectSingleNode("NEWS/LOGO").text
sDate = XMLDoc.selectSingleNode("NEWS/DATE").text
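The lookups above assume that both nodes exist. selectSingleNode returns Nothing when nothing matches, so calling .text on the result would then fail. Here is a minimal sketch of a more defensive version. The helper name NodeText, the ProgID, and the sample values are assumptions made for illustration and are not part of the original code. Under ASP, Response.Write would take the place of WScript.Echo.

```vbscript
Option Explicit

' Hypothetical helper: returns the text of the first node matching sPath,
' or sDefault when no such node exists.
Function NodeText(oParent, sPath, sDefault)
    Dim oNode
    Set oNode = oParent.selectSingleNode(sPath)
    If oNode Is Nothing Then
        NodeText = sDefault
    Else
        NodeText = oNode.text
    End If
End Function

Dim XMLDoc, sLogo, sDate

' Assumed ProgID; use whichever MSXML version is installed.
Set XMLDoc = CreateObject("Msxml2.DOMDocument.6.0")
XMLDoc.async = False ' load synchronously so the tree is ready immediately

' Made-up sample data standing in for the real news document.
If XMLDoc.loadXML("<NEWS><LOGO>logo.gif</LOGO><DATE>2000-01-01</DATE></NEWS>") Then
    sLogo = NodeText(XMLDoc, "NEWS/LOGO", "")
    sDate = NodeText(XMLDoc, "NEWS/DATE", "")
    WScript.Echo sLogo & " / " & sDate
Else
    WScript.Echo "Parse error: " & XMLDoc.parseError.reason
End If
```

Checking the return value of loadXML (or load) before querying means a malformed document is reported rather than failing later on the first lookup.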
Now we need to go through the rest of the XML and pull out our article data. Here's the ParseXML function:
ParseXML function
Private Sub ParseXML(oXMLDoc)
    ' This is the main XML parsing function.
    ' It creates a new Article object with the appropriate data
    ' and fills an array (arArticles) with the objects
    Dim XMLNodes ' Article Nodes
    Dim XMLCNodes ' Article Child Nodes
    Dim arPara() ' Array for paragraphs
    ' Get all the nodes
    Set XMLNodes = oXMLDoc.getElementsByTagName("ARTICLE")
    Dim XMLPNodes ' Paragraph Nodes
    Dim i ' generic counter
    For i = 0 to (XMLNodes.length - 1)
        ' cycle through articles
        iACtr = i + 1 ' increment article counter
        ReDim Preserve arArticles(iACtr)
        Set arArticles(iACtr) = New xArticle ' create a new article object
        ' get article child nodes
        Set XMLCNodes = XMLNodes.item(i).childNodes
        Dim j
        For j = 0 to (XMLCNodes.length - 1)
            Select Case XMLCNodes.item(j).nodeName
                Case "HEADING"
                    arArticles(iACtr).Heading = XMLCNodes.item(j).text
                Case "LINK"
                    arArticles(iACtr).Link = XMLCNodes.item(j).text
                Case "SUMMARY"
                    arArticles(iACtr).Summary = XMLCNodes.item(j).text
                Case "PARAGRAPHS"
                    Set XMLPNodes = XMLCNodes.item(j).childNodes
                    Dim p ' paragraph counter
                    For p = 0 to (XMLPNodes.length - 1)
                        ReDim Preserve arPara(p+1)
                        arPara(p+1) = XMLPNodes.item(p).text
                    Next ' p
                    arArticles(iACtr).Paras = arPara
            End Select
        Next ' j
    Next ' i
End Sub
This routine runs no matter what output is chosen. All we're doing here is walking through the XML tree, creating a new xArticle object
for every article in the XML news document, setting its variables to the appropriate data, and storing it in the global arArticles array. There
are about as many ways to walk an XML tree as there are DTDs, so we chose one of the simpler ones: a For loop bounded by the node list's length property.
We set a variable called XMLNodes to the node list returned by .getElementsByTagName, passing in "ARTICLE".
What we end up with is a node list of all the ARTICLE nodes. Since the rest of our data lives in child nodes of ARTICLE, we can loop through the
node list, pulling data out of the XML document (via .text) and putting it into the public variables of our class whenever we run across a .nodeName we want.
When we find a node named "PARAGRAPHS", we run a nested loop that inserts each paragraph into an array called arPara, which we then assign to the Paras() array in our class
so we can output as many paragraphs as called for. Note that both arArticles and arPara are filled starting at index 1, so element 0 of each is left unused. And that's all the XML magic we need. Now that we've got the data, let's format it.
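As one illustration of the "many ways to walk an XML tree" mentioned above, here is a hedged sketch of the same traversal written with For Each and relative path queries instead of index-based loops. It is an alternative for comparison, not the article's ParseXML routine. The ProgID, the sample document, and the P element name for individual paragraphs are assumptions, since the original does not show the paragraph element name. Under ASP, Response.Write would replace WScript.Echo.

```vbscript
Option Explicit

Dim oDoc, oArticle, oPara, sXML

' Made-up sample document; the P element name is an assumption.
sXML = "<NEWS><ARTICLE><HEADING>Sample heading</HEADING>" & _
       "<LINK>http://example.com/</LINK><SUMMARY>Sample summary</SUMMARY>" & _
       "<PARAGRAPHS><P>First paragraph.</P><P>Second paragraph.</P></PARAGRAPHS>" & _
       "</ARTICLE></NEWS>"

' Assumed ProgID; use whichever MSXML version is installed.
Set oDoc = CreateObject("Msxml2.DOMDocument.6.0")
oDoc.async = False

If oDoc.loadXML(sXML) Then
    ' MSXML node lists can be enumerated directly with For Each.
    For Each oArticle In oDoc.getElementsByTagName("ARTICLE")
        WScript.Echo "Heading: " & oArticle.selectSingleNode("HEADING").text
        ' "PARAGRAPHS/*" selects every child element of PARAGRAPHS,
        ' so the loop does not depend on the paragraph element's name.
        For Each oPara In oArticle.selectNodes("PARAGRAPHS/*")
            WScript.Echo "  Para: " & oPara.text
        Next
    Next
Else
    WScript.Echo "Parse error: " & oDoc.parseError.reason
End If
```

The trade-off is that this version asks for each field by name, so it fails on an article that lacks a HEADING, whereas the Select Case loop simply skips fields that are absent.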
Next: Output as [simple]HTML