XML

Unless you've been living in a vacuum for the past few years, you will have heard the acronym XML thrown around with increasing regularity. If you're primarily an Excel developer, you're probably also wondering what all the fuss is about. XML is a format used for the textual expression of data. In that respect, it's no different from the fixed-width, comma-separated or tab-delimited text formats we've been using for years. There are, however, a number of key factors that differentiate XML from all the other text formats that have come before it and make it much more appealing to developers:

•XML is a structured format, which means that we can define exactly how the data is to be arranged, organized and expressed within the file. When we are given a file, we can validate that it conforms to a specific structure, prior to importing the data. As we know the structure of the file in advance, we know what it contains and how to process each item. Prior to XML, the only structure in a text file was positional (we knew the bit of text after the fourth comma should be a date of birth), and we had no way to validate whether it was a date of birth, or even a date, or whether it was in day/month/year or month/day/year order.

•XML is a described format, which means that within the text file, every item of data has a name that is both human- and machine-readable as well as being uniquely identifiable. We can open these files, read their contents and understand the data they contain, without having to refer back to another document to find out what the text after the fourth comma represents (and was that comma a separator, or part of the text of the second item?). Similarly, we can edit these documents with a fairly high level of confidence that we're making the correct changes.

•XML can easily describe hierarchical data and the relationships between data. If we want to import and export a list of authors, with their names, addresses and the books they've written, deciding on a reasonable format for a CSV file is by no means straightforward. Using XML, we can define what an Author item is and that it has a name, address and multiple Book items. We can also define what a Book item is and that it has a title, a publisher and an ISBN. The hierarchy and relationships are a natural consequence of the definition.

•XML can be validated, which means we can provide a second XML file (an XML schema definition file) that describes exactly how the XML data file should be structured. Before processing an XML file, we can compare it with the schema to ensure it conforms to the structure we expect to receive.

•XML is a discoverable format, which means programs (including Excel 2003) can parse an XML data file and infer the structure and relationships between the items. This means we can read an XML file, infer its structure and generate new XML data files that conform to the same structure, with a high degree of confidence the new XML data files will pass validation.

•XML is a strongly typed format, which means the schema definition file specifies the data type of each element. When importing the data, the application can check the schema definition to identify the data type to import it as. We no longer run the risk of the product code 01-03 being imported as a date.

•XML is a global format. There is only one way to express a number in an XML file (with U.S. number formats) and only one way to express a date. We no longer have to check whether a CSV file was created with U.S. or French settings and adjust our processing of it accordingly.

•XML is a standard format. The way in which the content of an XML file is defined has been specified by the World Wide Web Consortium (W3C). This allows applications (including Excel 2003) to read, understand and validate the structure of an XML file and create files that conform to the specified structure. It also allows different applications to read, write, understand and validate the same XML files, enabling us to share data between applications in an extremely robust manner.
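
As a small illustration of the "global format" point above, the fragment below sketches how a number and a date are conventionally written in an XML data file when they are typed with the W3C schema types: a period as the decimal separator, no thousands separator, and the date in year-month-day order. The Order, Amount and OrderDate element names are hypothetical, invented purely for this sketch.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<!-- Hypothetical elements, for illustration only -->
<Order>
  <!-- Written as 1234.56 whatever the regional settings,
       never as 1.234,56 or 1 234,56 -->
  <Amount>1234.56</Amount>
  <!-- Written year-month-day, so 1 July 2004 cannot be
       misread as 7 January 2004 -->
  <OrderDate>2004-07-01</OrderDate>
</Order>
```

A file created on a machine with French settings and one created with U.S. settings would contain exactly the same text.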

So is there anything we can do with XML we couldn't do using technologies we already know? No, not really. But then, there's nothing we can do with a spreadsheet we couldn't also do with a pen and paper (and maybe a basic calculator!). Since the earliest computers, we've been storing data and sharing it between applications. If we control both ends of the dialogue, it doesn't matter what's passed between them, so long as each end knows what to supply and what to expect and nothing goes wrong. If the format of a file is documented, any application could (in theory) be programmed to read and write the same data files. With XML files, an application can read (or infer) the structure definition and join in any conversation without extra programming. Using XML just makes some things a whole lot easier and more reliable.

An Example XML File

Listing 23-1 shows an example XML file for an author, including his name, e-mail address and some of the books he has been involved with.

Listing 23-1. An Example XML File

<?xml version="1.0" encoding="utf-8" ?>
<Author>
  <Name>Stephen Bullen</Name>
  <Email>stephen@oaltd.co.uk</Email>
  <Book>
    <Title>Professional Excel Development</Title>
    <Publisher>Addison Wesley</Publisher>
    <ISBN>0321262506</ISBN>
  </Book>
  <Book>
    <Title>Excel 2002 VBA Programmer's Reference</Title>
    <Publisher>Wrox Press</Publisher>
    <ISBN>1861005709</ISBN>
  </Book>
</Author>

If XML lives up to its hype, you should have been able to read and understand all the items of data in that file and understand the relationships between the elements. Just in case, we'll highlight the main items:

•The first line identifies the contents of the file as XML. XML files normally start with this line; the XML 1.0 specification makes the declaration optional, but including it is recommended.

•The file consists of both data and pairs of tags surrounding the data, which are together called an element. Our file consists of Author, Name, Email, Book, Title, Publisher and ISBN elements. A tag is identified by text enclosed within angle brackets, like <Tag>. All the tags come in pairs, with an opening tag like <Tag> and a closing tag like </Tag>; all the text between the opening and closing tags in some way "belongs" to the tag. However, if there is nothing contained within the opening and closing tags, they can be combined so that <Tag></Tag> can be shown as <Tag/>. This is often used when the data for an element is provided as an attribute of the element, using a syntax like <Publisher name="Addison Wesley"/>. There is little difference between using elements or attributes, though our preference is to use elements. Note that tags and attributes are case-sensitive, so <Author> will not match with </author>.

•The second line identifies a root element, which in this file represents an Author. Every XML file must have one and only one root element; all other elements in the file belong to the root element.

•The third and fourth lines identify the author's name and e-mail address; we know it's the author's name and e-mail address because they're both within the same <Author> element.

•The fifth line is the start of a Book element, with the next three lines giving the book's details (because they're contained within the Book element). The ninth line closes the Book element, telling us we've finished with that book.

•Lines 10 to 14 show a second Book element, with the book's details.

•Line 15 closes the Author element, telling us we've finished with that author.
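
The element-versus-attribute choice mentioned in the list above can be sketched by rewriting Listing 23-1 with each book's details carried as attributes on an empty Book element. This is only an illustration: the attribute names are our assumption (chosen to mirror the element names), and the rest of the chapter continues to use the element form.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<Author>
  <Name>Stephen Bullen</Name>
  <Email>stephen@oaltd.co.uk</Email>
  <!-- Each Book now has no content between its tags, so the
       opening and closing tags are combined into <Book ... /> -->
  <Book Title="Professional Excel Development"
        Publisher="Addison Wesley"
        ISBN="0321262506"/>
  <Book Title="Excel 2002 VBA Programmer's Reference"
        Publisher="Wrox Press"
        ISBN="1861005709"/>
</Author>
```

The two forms carry the same information, but they are not interchangeable: a schema written for the element form would reject this variant, because attributes have to be declared in the schema just as elements are.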

That example hopefully demonstrates the main attributes of an XML file. It is structured, described, hierarchical and relational, but how is it validated?

An Example XSD File

The structure of an XML file is specified using an XML schema definition file, which usually has the extension .xsd and contains sets of XML tags that have been defined by the W3C. The XSD file for the Author XML data is shown in Listing 23-2.

Listing 23-2. An Example XSD File

<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Author">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="Name" type="xs:string"/>
    <xs:element name="Email" type="xs:string"
                minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="Book"
                minOccurs="0" maxOccurs="unbounded">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="Title" type="xs:string"/>
       <xs:element name="Publisher" type="xs:string"/>
       <xs:element name="ISBN" type="xs:string"/>
      </xs:sequence>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>

This is slightly less readable XML! We explain how to create an XSD file later in the chapter, but it's helpful to understand how this file describes the structure of the XML data file shown in Listing 23-1:

•Like all XML files, the first line identifies the contents as XML.

•The second line identifies the namespace http://www.w3.org/2001/XMLSchema and gives it the alias, xs. This is the namespace defined by the W3C that contains all the XML tags used in XML schema definition files. When we need to use a tag from that namespace, we precede it with the xs: alias identifier so the XML processor can correctly identify it. This mechanism of using namespace aliases is often encountered in XML files that contain elements from multiple namespaces (such as Excel workbook files, which contain tags from both the Excel and Office namespaces).

•The third line defines an Author element which must occur once and only once in the file (unless otherwise specified, the default occurrence of a tag is 'must occur once and only once'), so our XML data file can only be for one author.

•The fourth line states that the Author element is a complexType, which means it contains other elements.

•The fifth line states that all the items within the Author element must be listed in the sequence shown in the XSD file (that is, Name, then Email, then Book).

•The sixth line defines an element within Author called Name, of type string, and there must be one and only one of them. The use of the /> at the end of the element tag is a shorthand for creating a self-closing tag, so <Tag/> is equivalent to <Tag></Tag>.

•The seventh and eighth lines define an element within Author called Email, of type string, which doesn't have to occur (minOccurs="0") or there can be any number of them (maxOccurs="unbounded").

•Lines 9 to 15 define an element within Author called Book, of which there can be any number. If provided, each Book element must contain a single Title, Publisher and ISBN string element in that order.

•Lines 16 to 22 close out the tags.

Before we import any data files, we can check that they conform to these rules (assuming we have the XSD file to check them against) and reject any files that can't be validated.
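
To make these rules concrete, here is a sketch of a data file that is perfectly well-formed XML but should be rejected when validated against the schema in Listing 23-2. The comments mark the two rule violations. The file is our own illustration, not one of the sample files on the CD.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<Author>
  <!-- Problem 1: the mandatory Name element is missing, so the
       sequence starts with Email instead of Name -->
  <Email>stephen@oaltd.co.uk</Email>
  <Book>
    <Title>Professional Excel Development</Title>
    <!-- Problem 2: ISBN appears before Publisher, but the
         sequence requires Title, then Publisher, then ISBN -->
    <ISBN>0321262506</ISBN>
    <Publisher>Addison Wesley</Publisher>
  </Book>
</Author>
```

Neither problem would be caught by a simple check that the tags are correctly paired and nested; it takes the schema to tell us the content is not what we expected.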

Overview of Excel 2003's XML Features

NOTE

The XML features added to Excel 2003 are only available in the Professional version of Office and the standalone version of Excel; they have been disabled in the Standard and Student versions of Office. In practice, this means that if we want to utilize the new XML features, we and all our users must be running Office 2003 Professional.

Throughout this book, we've been stressing the importance of physically separating our data from our code, so we can easily update our code without affecting the data; our PETRAS timesheet add-in has undergone some major changes, but our timesheet template file has stayed (pretty much) the same throughout. We've been a little quiet, though, about what we should consider our "code" to be, and hence where to put the break between application and data. That is because the only real choice we've had is to put the break at the boundary between VBA and our Excel workbooks and templates. Whenever we've had a new set of data to store, we've stored it inside a copy of our template.

That leaves us a little concerned and hopeful that we don't have to change anything in the template. If we discovered a bug in the data validation settings, we would have to open and update every copy of every timesheet submitted using that template (or just ignore it for archived files!). We haven't really separated our data from our logic. Within each of our data files, we're storing lots of formatting, validation and ancillary information as well as the data entered into the timesheet.

What we would really like to do is to completely separate the raw data from the formatting and data validation, so we would only need one copy of the data-entry workbook on each machine which could import and export the raw data. That's exactly what Excel 2003's XML features enable us to do!

Using Excel 2003's new XML Source task pane, we can import an XML schema definition file into a workbook and link the elements defined in that file to cells (for the single elements) or Lists (for the multiple-occurring elements) in the workbook.

We can then import any XML data file that conforms to the schema into our workbook. Excel will parse the XML data file, check that it conforms to the schema, read the data from all the elements and populate the linked cells and lists. Figure 23-1 shows an Excel 2003 workbook containing the XSD from Listing 23-2 and having imported the XML data from Listing 23-1.

Figure 23-1. An Excel Workbook Linked to an XML Schema

We can also type data into the linked cells and lists and export the data as an XML file. Excel will create an XML data file that conforms to the schema and contains the data from the linked cells and lists.

Excel's XML features can greatly help with the maintenance of our financial models as well. Until Excel 2003, if we wanted to use our model to analyze different data sets, we would have to use a separate copy of the model workbook for each set. If we subsequently found an error in our model, we would have to open and update all the copies. In Excel 2003, we can create a schema for our model's input variables and another schema for its results, include them both in the workbook and link them to the relevant cells or Lists. We can then use a single copy of the model workbook to import the input variables, calculate the model and export the results.

The new XML features are, of course, all exposed to VBA, so we can easily identify which cells are linked to which elements of which schemas (and vice versa), read and write the XML to/from strings as well as (or instead of) importing and exporting files and respond to events raised both before and after XML data is imported or exported.

A Simple Financial Model

To demonstrate how Excel 2003 uses XML, we'll create a simple financial model that calculates the net present value of a list of cash flows, giving us the number of flows, the total cash flow and the net present value. We'll also record the model's version number and the date and time the model was calculated. Figure 23-2 shows the spreadsheet for the model, which can also be found in the Model1.xls workbook on the CD in the \Concepts\Ch23Excel, XML and Web Services folder.

Figure 23-2. The Net Present Value Calculation Model

Note that the Flows data in B9:B13 is in an Excel 2003 List, so as data is typed into it, the references used in the functions in cells E9:E11 are automatically updated. This is obviously a very simple financial model to demonstrate the principles. In practice, there may be many sets of input data, many worksheets of calculations, pivot tables and so forth, and a large set of results.

Let's assume for now we want to analyze many sets of data (in this case, different combinations of rates and cash flows), and we want to store each set of data somewhere, so we can come back to it at a later date. Let's also imagine this is a large and complex model, so we would prefer not to have multiple copies of it to keep in sync.

What we'd really like to do is tell Excel what bits of the file are the raw data and be able to import and export just that data in a form we could edit and maybe even create offline. With Excel 2003, we can do exactly that.

Creating an XML Schema Definition

The first step is to create an XML schema definition (XSD) file to define our raw data. If we already have an XML file containing some data we want to import, Excel can infer an XSD from it. Excel generally does quite a good job at inferring the structure, but we have more control over the details if we define it ourselves. For example, in the Authors XML file in Listing 23-1, the data file included a single e-mail address. Excel will infer the schema only allows one address, but the real schema allows multiples. Excel also always assumes data is optional, while we've made the author name mandatory.
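
The difference can be sketched by comparing the hand-written declarations from Listing 23-2 with declarations that behave the way the inference is described above. The second pair is our illustration of that behavior only; it is not the literal text Excel generates, and both pairs are fragments that assume the xs prefix is bound as in Listing 23-2.

```xml
<!-- Hand-written (Listing 23-2): Name is mandatory and
     Email may occur any number of times -->
<xs:element name="Name" type="xs:string"/>
<xs:element name="Email" type="xs:string"
            minOccurs="0" maxOccurs="unbounded"/>

<!-- Equivalent of the inferred rules: both elements optional,
     and at most one Email (maxOccurs defaults to 1) -->
<xs:element name="Name" type="xs:string" minOccurs="0"/>
<xs:element name="Email" type="xs:string" minOccurs="0"/>
```

A file with no Name, or with two Email elements, is treated differently by the two versions, which is why we prefer to write the schema ourselves.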

All the input data is shown with a light shading in Figure 23-2, from which we can see the structure we would like to emulate:

•There is a single block of control information, which must exist.

•Within the control information, we have a name, e-mail address and comment. For this example, we'll make the name and e-mail required, but the comment optional. Each item can only occur once (if at all) and they're all strings.

•We then have a single block of data information, which must exist.

•The data information contains a single Rate figure and multiple Flows figures, all of which are Doubles. Although not required by the NPV function, we'll require a minimum of two cash flow amounts.

The XSD for this data is shown in Listing 23-3, which includes a root NPVModelData element to contain our data types.

Listing 23-3. The XSD File for the NPV Model Data

<?xml version="1.0" ?>
<XSD:schema xmlns:XSD="http://www.w3.org/2001/XMLSchema">
<XSD:element name="NPVModelData">
  <XSD:complexType>
   <XSD:sequence>
    <XSD:element name="ControlInformation">
     <XSD:complexType>
      <XSD:sequence>
       <XSD:element name="SubmittedBy" type="XSD:string" />
       <XSD:element name="Email" type="XSD:string" />
       <XSD:element name="Comment" type="XSD:string"
                     minOccurs="0" maxOccurs="1" />
      </XSD:sequence>
     </XSD:complexType>
    </XSD:element>
    <XSD:element name="InputData">
     <XSD:complexType>
      <XSD:sequence>
       <XSD:element name="Rate" type="XSD:double" />
       <XSD:element name="Flows" type="XSD:double"
                     minOccurs="2" maxOccurs="unbounded" />
      </XSD:sequence>
     </XSD:complexType>
    </XSD:element>
   </XSD:sequence>
  </XSD:complexType>
</XSD:element>
</XSD:schema>

As this is XML, you should be able to read Listing 23-3 and see the direct correlation to the data in our worksheet and the previous statements about the structure we want to emulate. A few noteworthy points are as follows:

•We always start an XSD file with the same first two lines.

•Every element that is a container of other elements must be followed by the <XSD:complexType> tag and a tag to identify how the elements are contained. In this example (and in most cases), we use the <XSD:sequence> tag to say that the elements are contained in the sequence shown.

•The Comment element includes the attributes minOccurs="0" maxOccurs="1", which is how we specify an optional item; it doesn't have to occur (minOccurs="0"), but if it does occur, there can only be one of them (maxOccurs="1").

•The Flows element includes the attributes minOccurs="2" maxOccurs="unbounded", which is how we specify that there must be at least two cash flows, but there can be any number. Theoretically, we should put maxOccurs="65527", as that is the maximum number of flows that will fit on our model worksheet.
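
If we did want to enforce the worksheet's capacity in the schema, the Flows declaration from Listing 23-3 could be written as sketched below. The limit shown assumes the layout of our model worksheet in Excel 2003; a different layout would need a different figure.

```xml
<!-- Fragment: would replace the Flows declaration inside the
     InputData sequence; the XSD prefix is bound as in Listing 23-3 -->
<XSD:element name="Flows" type="XSD:double"
             minOccurs="2" maxOccurs="65527" />
```

The unbounded form used in the listing has the advantage of not tying the schema to one particular worksheet layout.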

XML Maps

Now that we have an XSD file describing our data, we need to tell Excel to use it and to link each element in the XSD file to a worksheet cell or range. Importing the schema and linking it to cells is known as mapping and Excel refers to these as XML Maps (which is Excel's terminology, not an industry-wide one).

So let's map our XSD to our model. Open the Model1.xls file, click View > Task Pane and select the XML Source task pane from the drop-down in the task pane title bar, shown in Figure 23-3.

Figure 23-3. Selecting the XML Source Task Pane

Click the XML Maps… button at the bottom of the XML Source task pane to bring up the XML Maps dialog, click the Add button on the dialog and browse to the XSD file. If the XSD is valid, Excel will import the schema and create an XML map using it, as shown in Figure 23-4. If there is an error in the XSD, Excel will show you where it thinks the error is. Note that if we selected an XML data file instead of the XSD, Excel would infer a schema from the XML data. It is definitely best practice, though, to create and use an XSD file.

Figure 23-4. The XML Map Dialog After Adding the NPVModelData Schema

When we click OK on the XML Map dialog, Excel examines the schema and displays it in the XML Source task pane, as shown in Figure 23-5.

Figure 23-5. The XML Source Task Pane, Showing the NPVModelData Schema

Note that Excel has identified the hierarchical structure of the schema, the elements that are required (shown with an asterisk in the icon) and the elements that are repeating (shown by the arrow at the bottom of the Flows icon).

The final step is to associate the elements in the schema with the data-entry cells in our model worksheet. We do this by selecting each element from the tree in the task pane, dragging it to the worksheet and dropping it on the cell that we want to link it to. In Figure 23-6, we're dragging the SubmittedBy element and dropping it on cell B4.

Figure 23-6. Drag and Drop the Elements from the Task Pane to the Worksheet

Similarly, we'll map the rest of the schema to our worksheet by dropping the Email element to B5, the Comment element to B6, the Rate element to A10 and the Flows element to B10 (or anywhere inside the Flows list). As we do that, Excel annoyingly adjusts the column widths of each cell to fit the data it contains. We would much prefer the default behavior to not do that, but we can switch it off by right-clicking one of the mapped cells and choosing XML > XML Map Properties from the popup menu to display the XML Map Properties dialog shown in Figure 23-7, in which we've set the properties that we recommend using. We should be able to access this dialog from the XML Maps dialog we used to select a map, but for some reason, we can't!

Figure 23-7. The Recommended Settings for XML Map Properties

The first check box, Validate data against schema for import and export, defaults to off, but in our opinion is the most important setting in the whole of Excel's XML support. With it turned on, Excel will verify that the XML data files we import conform to the format defined in the schema and that the data we type into cells conforms to the schema before allowing us to export it. Turning off those checks seems to us to invalidate the whole point of using XML in the first place: that of reliable and robust data transfer.

That's it! We've defined the raw data our financial model uses, created an XSD file to formally specify it, added the schema to the model and linked the elements in the schema to the model's data entry cells. The completed workbook can be found in the Model2.xls workbook.

Exporting and Importing XML Data

The menu items to import and export our XML data can be found on the Data > XML menu, with toolbar buttons also located on the List toolbar. Using the Export XML menu results in the XML data file for our model shown in Listing 23-4.

Listing 23-4. The XML Data File Produced from Our Model

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<NPVModelData>
  <ControlInformation>
    <SubmittedBy>Stephen Bullen</SubmittedBy>
    <Email>stephen@oaltd.co.uk</Email>
    <Comment>Fee Fi Fo Fum</Comment>
  </ControlInformation>
  <InputData>
    <Rate>0.05</Rate>
    <Flows>10</Flows>
    <Flows>20</Flows>
    <Flows>30</Flows>
    <Flows>40</Flows>
  </InputData>
</NPVModelData>

Hopefully, everything in the file makes sense by now, particularly the multiple <Flows> elements. If we delete the <Comment> element, add a few more <Flows> elements to the bottom, save it with a different name and use the Import XML menu to import it into our model, we get the worksheet shown in Figure 23-8. Remember that our XSD file specified the <Comment> tag as optional, so our file passes the schema validation even though the comment data is missing. The extra <Flows> elements have been included in the List, which has automatically extended to accommodate them, and the formulas in cells E9:E11 have also automatically been adjusted to suit!

Figure 23-8. Importing an XML Data File Adjusts the Ranges
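
For reference, the edited file described above might look like the sketch below. The two extra Flows values (50 and 60) are arbitrary figures chosen for illustration; any number of additional Flows elements would do.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<NPVModelData>
  <ControlInformation>
    <SubmittedBy>Stephen Bullen</SubmittedBy>
    <Email>stephen@oaltd.co.uk</Email>
    <!-- Comment element deleted: still valid, because the
         schema declares it with minOccurs="0" -->
  </ControlInformation>
  <InputData>
    <Rate>0.05</Rate>
    <Flows>10</Flows>
    <Flows>20</Flows>
    <Flows>30</Flows>
    <Flows>40</Flows>
    <!-- Extra cash flows: still valid, because the schema
         declares Flows with maxOccurs="unbounded" -->
    <Flows>50</Flows>
    <Flows>60</Flows>
  </InputData>
</NPVModelData>
```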

We have achieved our goal of being able to totally separate our data from our model, importing and exporting the data as we choose, with the model automatically updating to use the new data as we import it.

The XML Object Model and Events

Now that we can import and export the raw data for the model, we'll probably want to import the data and then export the results, with the export file containing a copy of the input data, details about the model itself, such as the version number and when the calculation was done, and the model's results. Listing 23-5 shows the XSD file for the full set of our NPVModel data, which can be found on the CD in the NPVModel.XSD file. The definition for the NPVModelData schema from Listing 23-3 has been included inside the new root NPVModel tag and we've added elements for the model details and results. It looks complicated, but isn't really; just remember that when we want to nest one element inside another, we have to include a pair of <XSD:complexType> and <XSD:sequence> tags between them.

Listing 23-5. The Full XSD File for Our Model

<?xml version="1.0" ?>
<XSD:schema xmlns:XSD="http://www.w3.org/2001/XMLSchema">
<XSD:element name="NPVModel">
  <XSD:complexType>
   <XSD:sequence>
    <XSD:element name="NPVModelData">
     <XSD:complexType>
      <XSD:sequence>
       <XSD:element name="ControlInformation">
        <XSD:complexType>
         <XSD:sequence>
          <XSD:element name="SubmittedBy"
                        type="XSD:string" />
          <XSD:element name="Email" type="XSD:string" />
          <XSD:element name="Comment" type="XSD:string"
                        minOccurs="0" maxOccurs="1" />
         </XSD:sequence>
        </XSD:complexType>
       </XSD:element>
       <XSD:element name="InputData">
        <XSD:complexType>
         <XSD:sequence>
          <XSD:element name="Rate" type="XSD:double" />
          <XSD:element name="Flows" type="XSD:double"
                        minOccurs="2" maxOccurs="unbounded" />
         </XSD:sequence>
        </XSD:complexType>
       </XSD:element>
      </XSD:sequence>
     </XSD:complexType>
    </XSD:element>
    <XSD:element name="NPVModelDetails">
     <XSD:complexType>
      <XSD:sequence>
       <XSD:element name="ModelVersion" type="XSD:string" />
       <XSD:element name="CalcDate" type="XSD:dateTime" />
      </XSD:sequence>
     </XSD:complexType>
    </XSD:element>
    <XSD:element name="NPVModelResults">
     <XSD:complexType>
      <XSD:sequence>
       <XSD:element name="FlowCount" type="XSD:double" />
       <XSD:element name="FlowTotal" type="XSD:double" />
       <XSD:element name="FlowNPV" type="XSD:double" />
      </XSD:sequence>
     </XSD:complexType>
    </XSD:element>
   </XSD:sequence>
  </XSD:complexType>
</XSD:element>
</XSD:schema>

We can add this schema to our model as a second XML map and map the NPVModelDetails and NPVModelResults elements to the appropriate cells in column E. When we try to map the ControlInformation elements to the cells in column B, however, Excel displays an error message "The operation cannot be completed because the result would overlap an existing XML mapping" and prevents us from doing the mapping. This is because Excel limits us to a one-to-one relationship between cells and XML elements; any one cell can only map to one element from one XML map and vice versa. We want all our input data to map to both the NPVModelData map (so we can import it) and the NPVModel map (so we can include it in the export). The only way we can achieve our objective is to have a copy of the input data that we include in our NPVModel map, as shown in Figure 23-9. All the single items, such as the e-mail address and rate, can be linked using standard worksheet formulae, but the lists will have to be synchronized through VBA.

Figure 23-9. Mapping the NPVModel Elements to a Copy of the Input Data

Fortunately, Excel 2003 includes a rich object model and event model for working with XML maps. We can use the Workbook_BeforeXMLExport event to copy the Flow data from the input range (B9 and below) to the export range (G8 and below), using the mapping to identify the ranges in each case, as shown in Listing 23-6.

Listing 23-6. Copying the Input Flows List to the Export Copy

'Run before any XML is exported
Private Sub Workbook_BeforeXMLExport(ByVal Map As XMLMap, _
    ByVal Url As String, Cancel As Boolean)
  Dim rngSource As Range
  Dim rngTarget As Range
  'Are we exporting the full Model data?
  If Map.RootElementName = "NPVModel" Then
    'Find the data part of the target list
    Set rngTarget = Sheet1.XMLDataQuery( _
        "/NPVModel/NPVModelData/InputData/Flows")
    'If there is any existing data in the target list,
    'remove it.
    If Not rngTarget Is Nothing Then rngTarget.Delete
    'Find the data part of the source list
    Set rngSource = Sheet1.XMLDataQuery( _
        "/NPVModelData/InputData/Flows")
    'Is there any source data to copy?
    If Not rngSource Is Nothing Then
      'Find the header part of the target list
      Set rngTarget = Sheet1.XMLMapQuery( _
          "/NPVModel/NPVModelData/InputData/Flows")
      'Copy the data to the cell below the target list header
      rngSource.Copy
      rngTarget.Cells(1).Offset(1, 0).PasteSpecial xlValues
    End If
  End If
End Sub

Within the object model, the linking between ranges and XML schema elements is done using XPaths. The XPath is a concatenated string of all the element names in an element's hierarchy, so to get to the Flows element in the NPVModelData map, we start at the root NPVModelData, go down to the InputData element and then to the Flows element, so the XPath for the Flows element in that map is /NPVModelData/InputData/Flows. This is stored in the XPath property of the Range object, so we can directly find out which element a range is mapped to. To find the range mapped to a given element, we use the XMLMapQuery and XMLDataQuery methods, passing the XPath of the element. It's a curiosity of the object model that while XML maps are workbook-level items and an element can be mapped to any range in any sheet in the workbook, the XMLMapQuery and XMLDataQuery methods are worksheet-level methods. If we didn't know which sheet the range was on, we would have to scan through them all, repeating the XMLMapQuery for each.

Both XMLMapQuery and XMLDataQuery return the range that is mapped to a given XPath string. The only difference between them is when the mapped range is a List; the XMLMapQuery returns the full range of the List, including the header row, whereas the XMLDataQuery returns only the data in the List, or Nothing if the List is empty.
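
The XPath strings passed to these methods follow the nesting of the data file directly. As a sketch, the fragment below repeats the input data from Listing 23-4 with the XPath of each element in the NPVModelData map noted in a comment above it.

```xml
<!-- /NPVModelData -->
<NPVModelData>
  <!-- /NPVModelData/ControlInformation -->
  <ControlInformation>
    <!-- /NPVModelData/ControlInformation/SubmittedBy -->
    <SubmittedBy>Stephen Bullen</SubmittedBy>
    <!-- /NPVModelData/ControlInformation/Email -->
    <Email>stephen@oaltd.co.uk</Email>
    <!-- /NPVModelData/ControlInformation/Comment -->
    <Comment>Fee Fi Fo Fum</Comment>
  </ControlInformation>
  <!-- /NPVModelData/InputData -->
  <InputData>
    <!-- /NPVModelData/InputData/Rate -->
    <Rate>0.05</Rate>
    <!-- /NPVModelData/InputData/Flows (shared by every Flows element) -->
    <Flows>10</Flows>
    <Flows>20</Flows>
    <Flows>30</Flows>
    <Flows>40</Flows>
  </InputData>
</NPVModelData>
```

All four Flows elements share a single XPath, which is why that path is mapped to a List rather than to a single cell.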

With just a few mouse clicks, we can now import some raw data for our financial model, recalculate it and export the results, giving an XML data file like the one shown in Listing 23-7.

Listing 23-7. The XML Data File from Our NPV Model

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<NPVModel>
  <NPVModelData>
    <ControlInformation>
      <SubmittedBy>Stephen Bullen</SubmittedBy>
      <Email>stephen@oaltd.co.uk</Email>
      <Comment>Fee Fi Fo Fum</Comment>
    </ControlInformation>
    <InputData>
      <Rate>0.05</Rate>
      <Flows>10</Flows>
      <Flows>20</Flows>
      <Flows>30</Flows>
      <Flows>40</Flows>
    </InputData>
  </NPVModelData>
  <NPVModelDetails>
    <ModelVersion>1.0</ModelVersion>
    <CalcDate>2004-07-01T13:44:04.430</CalcDate>
  </NPVModelDetails>
  <NPVModelResults>
    <FlowCount>4</FlowCount>
    <FlowTotal>100</FlowTotal>
    <FlowNPV>86.49</FlowNPV>
  </NPVModelResults>
</NPVModel>

It's not hard to envisage our financial model being used as a "black box" service, whereby individuals (or other applications) submit XML files containing the raw data for the model, we import it, calculate and export the results and send them back.

Notice the very specific format used for the date and time in the CalcDate element, which is how XML avoids the issues of identifying different date formats. It doesn't, however, account for different time zones!
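
If we write an XML file ourselves from VBA instead of letting Excel export it, we have to produce that date format explicitly. The following is a minimal sketch using the VBA Format function. The function name is our own, and the result omits the fractional seconds that Excel writes, because Format has no placeholder for them.

```vba
'Hypothetical helper: format a VBA Date in the
'yyyy-mm-ddThh:mm:ss layout used by the CalcDate element
Function XMLDateTime(ByVal dtValue As Date) As String
  'The backslash makes the T a literal character;
  'nn is the VBA placeholder for minutes
  XMLDateTime = Format$(dtValue, "yyyy-mm-dd\Thh:nn:ss")
End Function
```

For example, XMLDateTime(DateSerial(2004, 7, 1) + TimeSerial(13, 44, 4)) returns 2004-07-01T13:44:04.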

By adding the ability to export results directly from our model, we've also created a vulnerability. Users could import data into that map as well, which would overwrite our formulas! We can prevent this using the Workbook_BeforeXMLImport event, as shown in Listing 23-8.

Listing 23-8. Prevent Importing of the Results XML

'Run before any XML is imported
Private Sub Workbook_BeforeXMLImport(ByVal Map As XMLMap, _
      ByVal Url As String, ByVal IsRefresh As Boolean, _
      Cancel As Boolean)
  'Are we importing to the full Model data?
  If Map.RootElementName = "NPVModel" Then
    'Yes, so disallow it
    MsgBox "The XML file you selected contains the " & _
            "results for this model, and cannot be imported."
    'Cancel the import
    Cancel = True
  End If
End Sub

XML Support in Earlier Versions

Excel 2003 has made the handling of arbitrary XML files extremely easy, but we don't have to upgrade to Excel 2003 to use XML. As mentioned at the start of the chapter, XML is just another text file format, so in theory we can read and write XML files using standard VBA text handling and file I/O code. When Excel 2003 imports an XML data file, it uses the MSXML library to do the validation and parsing of the file, and there's nothing stopping us referencing the same library from VBA. Of course, we also have to write our own routines to import the data from the MSXML structure to the worksheet and export the data from the sheet to an XML file. Multiple-version compatibility is one of the key design goals for our PETRAS timesheet application, so we show the VBA technique in the Practical Example section at the end of this chapter.
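
To give a flavor of the approach, the following sketch loads an NPVModelData file with MSXML and prints each cash flow to the Immediate window. It is an illustration under assumptions and not the PETRAS code: the routine name and sFile argument are our own, MSXML is late bound through the version-independent MSXML2.DOMDocument ProgID so no reference is needed, and the file is only parsed, not validated against a schema.

```vba
'A sketch only: list the Flows values from an XML data file
Sub ListFlowsWithMSXML(ByVal sFile As String)

  Dim xmlDoc As Object      'MSXML2.DOMDocument
  Dim xmlNodes As Object    'IXMLDOMNodeList
  Dim xmlNode As Object     'IXMLDOMNode

  Set xmlDoc = CreateObject("MSXML2.DOMDocument")
  xmlDoc.async = False      'Load synchronously

  'Load returns False if the file is missing or not well formed
  If Not xmlDoc.Load(sFile) Then
    MsgBox "Could not load the XML file: " & _
           xmlDoc.parseError.reason
    Exit Sub
  End If

  'Ask for XPath syntax rather than the older default
  xmlDoc.setProperty "SelectionLanguage", "XPath"

  'The same XPath that Excel uses for the mapped element
  Set xmlNodes = _
      xmlDoc.selectNodes("/NPVModelData/InputData/Flows")

  For Each xmlNode In xmlNodes
    Debug.Print xmlNode.Text
  Next xmlNode
End Sub
```

Writing the values to a worksheet range instead of the Immediate window is then ordinary VBA.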

The VBA technique is also required in Excel 2003 if the structure of the XML data is too complex to be handled by Excel's fairly simplistic mapping abilities. For example, our PETRAS timesheet workbook includes a table of clients and projects, with the client names across the top and the projects listed below each client (to feed the data validation drop-downs). It is not possible to map an XML schema to that layout, so the import of that section of the XML data file has to be done with VBA in all versions of Excel.

Using Namespaces

All of the examples shown so far in this chapter have ignored the use of namespaces. This means the XML files we use and produce are only identified by the root elements of NPVModel and NPVModelData. There is nothing in the file to identify them as the data for our NPV model. This means that, in theory, someone else could create an XML file that uses a very similar structure to ours and we could import it without knowing it was not intended for our application. To avoid this, we can include a namespace identifier both in the XSD and XML files, which is used to uniquely identify all the tags in the file, and hence the data they contain. When the file is processed, the namespace is prepended to all the tags, allowing the parser to distinguish between, say, the Name element in this file denoting the author's name and the Name element in a workbook file denoting an Excel Defined Name. The text of the namespace can be any string, but should be globally unique. It is general practice to use a URL, which has the advantage that the viewer of the file could browse to the URL in the hope of finding a description of the namespace.

We tell Excel the namespace to use by including it within the <XSD:schema> tag at the top of our XSD file, as shown in Listing 23-9 and included on the CD in the file NPVModelData - NS.xsd.

Listing 23-9. Providing Excel with a Namespace

<?xml version="1.0" ?>
<XSD:schema xmlns:XSD="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://www.oaltd.co.uk/ProExcelDev/NPVModelData"
  xmlns:md="http://www.oaltd.co.uk/ProExcelDev/NPVModelData"
  elementFormDefault="qualified" >
  <XSD:element name="NPVModelData">
  ...

When that schema is added to a workbook, Excel will remember the namespace, create an alias for it, such as ns0, ns1, ns2 and so on, and add that alias to the front of all the elements in the file, as shown in Figure 23-10.

Figure 23-10. All the XML Elements are Prefixed with the Namespace Alias
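
To check from code which aliases Excel has assigned, we can list the workbook's XmlNamespaces collection in the Immediate window. This is a minimal sketch for Excel 2003; the routine name is our own.

```vba
'Print each namespace alias and URI known to this workbook
Sub ListNamespaceAliases()

  Dim xnNamespace As XmlNamespace

  For Each xnNamespace In ThisWorkbook.XmlNamespaces
    'Prefix is the alias (such as ns1); Uri is the namespace
    Debug.Print xnNamespace.Prefix, xnNamespace.Uri
  Next xnNamespace
End Sub
```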

When the XML is exported, Excel includes the namespace in the file and qualifies all the elements with the namespace alias, as shown in Listing 23-10.

Listing 23-10. Providing Excel with a Namespace

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns1:NPVModelData
xmlns:ns1="http://www.oaltd.co.uk/ProExcelDev/NPVModelData">
  <ns1:ControlInformation>
    <ns1:SubmittedBy>Stephen Bullen</ns1:SubmittedBy>
    <ns1:Email>stephen@oaltd.co.uk</ns1:Email>
    <ns1:Comment>Fee Fi Fo Fum</ns1:Comment>
  </ns1:ControlInformation>
  <ns1:InputData>
    <ns1:Rate>0.05</ns1:Rate>
    <ns1:Flows>10</ns1:Flows>
    <ns1:Flows>20</ns1:Flows>
    <ns1:Flows>30</ns1:Flows>
    <ns1:Flows>40</ns1:Flows>
  </ns1:InputData>
</ns1:NPVModelData>

It is definitely a good practice to use namespaces in our XML files, to avoid any chance of Excel importing erroneous data into our applications. The only reason we haven't used them so far in this chapter is to avoid overcomplicating our explanation of Excel's XML features.
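
The same protection is available when we read XML files ourselves with MSXML. The following sketch tests whether a file's root element is NPVModelData in the namespace declared in Listing 23-9 before we go on to read it. The constant and function names are our own, and MSXML is late bound through the version-independent MSXML2.DOMDocument ProgID.

```vba
'The namespace declared in Listing 23-9
Const msNPV_NAMESPACE As String = _
      "http://www.oaltd.co.uk/ProExcelDev/NPVModelData"

'Hypothetical check: True only if the file loads and its root
'element is NPVModelData in our namespace
Function IsNPVModelDataFile(ByVal sFile As String) As Boolean

  Dim xmlDoc As Object    'MSXML2.DOMDocument, late bound

  Set xmlDoc = CreateObject("MSXML2.DOMDocument")
  xmlDoc.async = False    'Load synchronously

  If xmlDoc.Load(sFile) Then
    With xmlDoc.documentElement
      'baseName is the element name without its alias, so the
      'test works whichever alias the file happens to use
      IsNPVModelDataFile = (.baseName = "NPVModelData") And _
                           (.namespaceURI = msNPV_NAMESPACE)
    End With
  End If
End Function
```

Because the comparison uses the namespace text and not the alias, a file that uses md, ns1 or any other alias for our namespace passes, while a look-alike file with no namespace, or a different one, is rejected.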