PHP XML Extracting Information Using XPath - Web Development and Design | Tutorial for Java, PHP, HTML, Javascript

Saturday, June 15, 2019

PHP XML Extracting Information Using XPath

Problem

You want to make sophisticated queries of your XML data without parsing the document node by node.

Solution

Use XPath.

XPath is available in SimpleXML:

           $s = simplexml_load_file(__DIR__ . '/address-book.xml');
           $emails = $s->xpath('/address-book/person/email');

           foreach ($emails as $email) {
                   // do something with $email
           }

And in DOM:

           $dom = new DOMDocument;
           $dom->load(__DIR__ . '/address-book.xml');
           $xpath = new DOMXPath($dom);
           $emails = $xpath->query('/address-book/person/email');

           foreach ($emails as $email) {
                   // do something with $email
           }

Discussion

Except for the simplest documents, it’s rarely easy to access the data you want one element at a time. As your XML files grow more complex and your queries more selective, expressing the selection in XPath is easier than filtering the data by hand inside a foreach.
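As a rough sketch of the difference, the snippet below picks out one person's email address twice: first by looping and filtering in PHP, then with a single XPath predicate. The inline XML is an assumed stand-in for address-book.xml with made-up example.com addresses; the real file may contain other fields.

```php
<?php
// Assumed stand-in for address-book.xml; the real file may differ.
$xml = <<<'XML'
<address-book>
  <person>
    <firstname>David</firstname>
    <lastname>Sklar</lastname>
    <email>david@example.com</email>
  </person>
  <person>
    <firstname>Adam</firstname>
    <lastname>Trachtenberg</lastname>
    <email>adam@example.com</email>
  </person>
</address-book>
XML;

$s = simplexml_load_string($xml);

// Filtering by hand inside a foreach
$byLoop = [];
foreach ($s->person as $person) {
    if ((string) $person->lastname === 'Sklar') {
        $byLoop[] = (string) $person->email;
    }
}

// The same selection expressed as one XPath predicate
$matches = $s->xpath('/address-book/person[lastname="Sklar"]/email');
$byXPath = array_map('strval', $matches);

print_r($byXPath);
```

The predicate in square brackets keeps only the person elements whose lastname child equals "Sklar", so the filtering logic lives in the query rather than in the loop body.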

PHP has an XPath class, DOMXPath, that takes a DOM document as its constructor argument. You can then search the document and receive DOM nodes in reply. SimpleXML also supports XPath, and it’s easier to use because it’s integrated into the SimpleXML object.

DOM supports XPath queries, but you do not perform the query directly on the DOM object itself. Instead, you create a DOMXPath object, as shown:

           $dom = new DOMDocument;
           $dom->load(__DIR__ . '/address-book.xml');
           $xpath = new DOMXPath($dom);
           $emails = $xpath->query('/address-book/person/email');

Instantiate DOMXPath by passing in a DOMDocument to the constructor. To execute the XPath query, call query() with the query text as your argument. This returns an iterable DOM node list of matching nodes:

           $dom = new DOMDocument;
           $dom->load(__DIR__ . '/address-book.xml');
           $xpath = new DOMXPath($dom);
           $emails = $xpath->query('/address-book/person/email');

           foreach ($emails as $e) {
                    $email = $e->firstChild->nodeValue;
                    // do something with $email
           }

In this example, the query passed to DOMXPath::query() is /address-book/person/email. Each item in the returned node list is an element node whose text is held in a child text node, which is why the loop reads $e->firstChild->nodeValue.
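The node list can also be inspected without a loop. The following is a minimal sketch, again using an assumed inline stand-in for address-book.xml, that reads the list's length property, fetches one node with item(), and shows that a query with no matches yields an empty list. The phone element is hypothetical and deliberately absent from the sample data.

```php
<?php
// Assumed stand-in for address-book.xml; the real file may differ.
$xml = <<<'XML'
<address-book>
  <person>
    <firstname>David</firstname>
    <lastname>Sklar</lastname>
    <email>david@example.com</email>
  </person>
  <person>
    <firstname>Adam</firstname>
    <lastname>Trachtenberg</lastname>
    <email>adam@example.com</email>
  </person>
</address-book>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$emails = $xpath->query('/address-book/person/email');

// Number of matching nodes
print $emails->length . "\n";

// Text of the first match; textContent gathers the text of the node
// and its descendants, so no firstChild dereference is needed here
print $emails->item(0)->textContent . "\n";

// A query that matches nothing returns an empty node list
// ("phone" is a hypothetical element not present in the sample)
$none = $xpath->query('/address-book/person/phone');
var_dump($none->length === 0);
```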

By default, DOMXPath::query() operates on the entire XML document. To search only a subsection of the tree, pass the node at the root of that subtree as the second parameter to query(); relative paths are then resolved against that node. For instance, to gather all the first and last names of people in the address book, retrieve all the person nodes and query each node individually:

           $dom = new DOMDocument;
           $dom->load(__DIR__ . '/address-book.xml');
           $xpath = new DOMXPath($dom);
           $people = $xpath->query('/address-book/person');

           foreach ($people as $p) {
                    $fn = $xpath->query('firstname', $p);
                    $firstname = $fn->item(0)->firstChild->nodeValue;

                    $ln = $xpath->query('lastname', $p);
                    $lastname = $ln->item(0)->firstChild->nodeValue;

                    print "$firstname $lastname\n";
           }

Inside the foreach, call DOMXPath::query() to retrieve the firstname and lastname nodes. Now, in addition to the XPath query, also pass $p to the method. This makes the search local to the node.
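When only the text of a child element is needed, one possible variation is DOMXPath::evaluate(), which accepts the same context-node argument and can return a plain string when the expression uses the XPath string() function. This is a sketch against an assumed inline copy of the address book, not a replacement for the approach above.

```php
<?php
// Assumed stand-in for address-book.xml; the real file may differ.
$xml = <<<'XML'
<address-book>
  <person>
    <firstname>David</firstname>
    <lastname>Sklar</lastname>
    <email>david@example.com</email>
  </person>
  <person>
    <firstname>Adam</firstname>
    <lastname>Trachtenberg</lastname>
    <email>adam@example.com</email>
  </person>
</address-book>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$names = [];
foreach ($xpath->query('/address-book/person') as $p) {
    // string(...) makes evaluate() return a PHP string; passing $p as
    // the context node keeps the relative path local to this person
    $firstname = $xpath->evaluate('string(firstname)', $p);
    $lastname  = $xpath->evaluate('string(lastname)', $p);
    $names[] = "$firstname $lastname";
}

print implode("\n", $names) . "\n";
```

This avoids the item(0)->firstChild->nodeValue chain, at the cost of getting an empty string rather than an empty node list when the child element is missing.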

In contrast to DOM, all SimpleXML objects have an integrated xpath() method. Calling this method queries the current object using XPath and returns an array of SimpleXML objects, one per matching node, so you don’t need to instantiate another object to use XPath. The method’s one argument is your XPath query.

Here’s how to find all the matching email addresses in the sample address book:

           $s = simplexml_load_file(__DIR__ . '/address-book.xml');
           $emails = $s->xpath('/address-book/person/email');

           foreach ($emails as $email) {
                    // do something with $email
           }

This is shorter because there’s no need to dereference firstChild or read nodeValue: each $email is a SimpleXML object that yields its text when printed or cast to a string.

SimpleXML handles the more complicated example, too. Because xpath() returns an array of SimpleXML objects, you can call xpath() on each of them directly:

           $s = simplexml_load_file(__DIR__ . '/address-book.xml');
           $people = $s->xpath('/address-book/person');

           foreach($people as $p) {
                   list($firstname) = $p->xpath('firstname');
                   list($lastname) = $p->xpath('lastname');

                   print "$firstname $lastname\n";
           }

This prints:

           David Sklar
           Adam Trachtenberg

Because each inner XPath query matches only one element, use list() to pull that element out of the returned array.
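The list() shortcut assumes every query finds a match. If an element might be missing, list() reads an index that does not exist, and PHP reports an undefined-index notice or warning depending on the version. A defensive sketch, using an assumed inline copy of the address book and a hypothetical nickname element that the sample data lacks, checks the array before reading it:

```php
<?php
// Assumed stand-in for address-book.xml; the real file may differ.
$xml = <<<'XML'
<address-book>
  <person>
    <firstname>David</firstname>
    <lastname>Sklar</lastname>
  </person>
  <person>
    <firstname>Adam</firstname>
    <lastname>Trachtenberg</lastname>
  </person>
</address-book>
XML;

$s = simplexml_load_string($xml);

$rows = [];
foreach ($s->xpath('/address-book/person') as $p) {
    $first = $p->xpath('firstname');
    $nick  = $p->xpath('nickname'); // hypothetical element, absent here

    // An empty array is falsy, so fall back to null when nothing matched
    $rows[] = [
        $first ? (string) $first[0] : null,
        $nick  ? (string) $nick[0]  : null,
    ];
}

var_dump($rows);
```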


