Zend PHP 7 Certification – Data Formats and Types – XML Extension

This post covers the XML Extension section of the Data Formats and Types chapter when studying for the Zend PHP 7 Certification.

The XML Parser extension in PHP is enabled by default. It can be disabled with the --disable-xml option at compile time.

This extension requires the libxml PHP extension, which is usually enabled by default as well.

Some useful XML Parser functions include xml_parser_create(), which is used to create an XML parser.

$xmlparser = xml_parser_create();

The function takes one optional parameter, the encoding, as a string. The supported values are UTF-8 (the default), ISO-8859-1 and US-ASCII.
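As a quick check of the encoding parameter, here is a minimal sketch that creates a parser with ISO-8859-1 and reads the value back. It assumes the encoding is exposed through the XML_OPTION_TARGET_ENCODING option, and the variable names are only for illustration.

```php
<?php
// Create a parser whose handlers receive data in ISO-8859-1.
$latin1Parser = xml_parser_create('ISO-8859-1');

// Read the target encoding back from the parser's options.
$targetEncoding = xml_parser_get_option($latin1Parser, XML_OPTION_TARGET_ENCODING);
echo $targetEncoding, PHP_EOL;

xml_parser_free($latin1Parser);
```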

You can also ‘free’ an XML parser by using the xml_parser_free() function.

$xmlparser = xml_parser_create();

xml_parser_free($xmlparser);

The xml_parser_create_ns() function is similar to xml_parser_create(), except it creates a new XML parser with XML namespace support.

$xmlparser = xml_parser_create_ns();

It also takes the optional encoding parameter. An optional second parameter, the separator, specifies the string placed between the namespace URI and the tag name in the names passed to your handlers. It defaults to ':'.
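To see what the separator does, the sketch below parses a tiny namespaced document and records the element names the parser reports. The document, the namespace URI and the '|' separator are made up for illustration, and case folding is switched off so the URI stays readable.

```php
<?php
// Hypothetical document: the namespace URI and element names are illustrative only.
$xml = '<root xmlns:p="http://example.com/ns"><p:item/></root>';

// Second argument: separator placed between the namespace URI and the local name.
$nsParser = xml_parser_create_ns('UTF-8', '|');
// Keep names in their original case so the URI is easy to read.
xml_parser_set_option($nsParser, XML_OPTION_CASE_FOLDING, 0);

$names = [];
xml_set_element_handler(
    $nsParser,
    function ($parser, $name, $attrs) use (&$names) {
        $names[] = $name; // record each start tag as reported by the parser
    },
    function ($parser, $name) {
        // end tags are ignored in this sketch
    }
);

xml_parse($nsParser, $xml, true);
xml_parser_free($nsParser);

print_r($names);
```

The namespaced element should be reported as the URI, the separator and the local name joined together, while elements without a namespace keep their plain name.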

The xml_parse() function is responsible for parsing an XML document. It takes three parameters.

  • parser – The XML parser to use.
  • data – The chunk of data to parse.
  • is_final – Optional. If set to TRUE, the data parameter contains the last piece of data for this parse.
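The is_final parameter matters when a document is fed to the parser in pieces. Here is a minimal sketch, assuming a two-chunk split chosen purely for illustration, that passes TRUE only with the last chunk.

```php
<?php
// Two chunks that only form a complete document together (split mid-text on purpose).
$chunks = ['<greeting>Hel', 'lo</greeting>'];

$chunkParser = xml_parser_create();
$text = '';
xml_set_character_data_handler($chunkParser, function ($parser, $data) use (&$text) {
    $text .= $data; // character data may arrive in several pieces
});

$lastIndex = count($chunks) - 1;
foreach ($chunks as $i => $chunk) {
    // is_final is TRUE only for the last chunk.
    if (!xml_parse($chunkParser, $chunk, $i === $lastIndex)) {
        die('XML Error: ' . xml_error_string(xml_get_error_code($chunkParser)));
    }
}
xml_parser_free($chunkParser);

echo $text, PHP_EOL;
```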

As an example, a simple XML document might look like this.

// products.xml

<?xml version="1.0" encoding="UTF-8"?>
<productlist>
    <products>
        <apple>
            <description>Tasty!</description>
            <price>40</price>
        </apple>
        <orange>
            <description>Delicious!</description>
            <price>50</price>
        </orange>
    </products>
</productlist>

The PHP file could look like the following.

$parser = xml_parser_create();

function char($parser,$data) {
    echo $data;
}

xml_set_character_data_handler($parser,"char");

$fp = fopen("products.xml","r");

// Feed the file to the parser in 4 KB chunks; feof() flags the final chunk.
while ($data = fread($fp, 4096))  {
    xml_parse($parser, $data, feof($fp)) or
    die (sprintf("XML Error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

fclose($fp);
xml_parser_free($parser);

Viewed in a browser, which collapses the whitespace between the elements, this outputs:

Tasty! 40 Delicious! 50

Note the xml_set_character_data_handler() function used here. It sets the character data handler for the XML parser, that is, the function to call whenever the parser finds character data in the XML file. The handler receives the parser and the character data as its two arguments. The function returns TRUE on success, or FALSE on failure.
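One thing worth knowing is that the character data handler also receives the whitespace between tags. The following sketch, which uses an inline document instead of products.xml so it can run on its own, buffers everything the handler receives and trims it afterwards.

```php
<?php
// Inline document so the sketch runs without products.xml.
$xml = "<apple>\n    <description>Tasty!</description>\n</apple>";

$parser = xml_parser_create();
$buffer = '';

// The return value tells us whether the handler was registered.
$registered = xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$buffer) {
        $buffer .= $data; // includes the newlines and indentation between tags
    }
);

xml_parse($parser, $xml, true);
xml_parser_free($parser);

var_dump($registered);
var_dump(trim($buffer));
```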

Also note the error functions that are being used here. The xml_error_string() function returns the textual description for a given error code. The functions xml_get_error_code() and xml_get_current_line_number() return the error code and the current line number respectively. Both of these functions require the XML parser to be passed in as a parameter.

So if we were to make a deliberate error in the XML, we would get the error message and the line number in the output, like the following.

XML Error: not well-formed (invalid token) at line 9
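The three error functions can also be tried without a file. This is a small sketch with a deliberately malformed inline document (made up for illustration). The exact message and line number depend on the underlying XML library, so none are shown here.

```php
<?php
// Deliberately malformed: <b> is never closed before </a>.
$badXml = "<a>\n<b>text\n</a>";

$parser = xml_parse_ready = null; // placeholder removed below
$parser = xml_parser_create();

// xml_parse() returns 0 when the document is not well-formed.
$ok = xml_parse($parser, $badXml, true);

if (!$ok) {
    $code = xml_get_error_code($parser);          // numeric error code
    $message = xml_error_string($code);           // description for that code
    $line = xml_get_current_line_number($parser); // where the parser stopped
    printf("XML Error: %s at line %d\n", $message, $line);
}

xml_parser_free($parser);
```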

The xml_set_element_handler() function specifies functions to be called at the start and end of an element in the XML document.

It takes three parameters: the parser, the name of the start function and the name of the end function.

So if we modify our XML to look like the following:

// products.xml
<?xml version="1.0" encoding="UTF-8"?>
<productlist>
    <products>
        <product>
            <name>Apple</name>
            <description>Tasty!</description>
            <price>40</price>
        </product>
        <product>
            <name>Orange</name>
            <description>Delicious!</description>
            <price>50</price>
        </product>
    </products>
</productlist>

And add a call to xml_set_element_handler() in the PHP file:

$parser = xml_parser_create();

function char($parser,$data) {
    echo $data;
}

function start($parser,$element_name,$element_attrs)
{
    switch($element_name)
    {
        case "PRODUCT":
            echo "-- Product --<br />";
            break;
        case "NAME":
            echo "Name: ";
            break;
        case "DESCRIPTION":
            echo "Description: ";
            break;
        case "PRICE":
            echo "Price: ";
            break;
    }
}

function stop($parser,$element_name)
{
    echo "<br />";
}

xml_set_element_handler($parser,"start","stop");
xml_set_character_data_handler($parser,"char");

$fp = fopen("products.xml","r");

// Feed the file to the parser in 4 KB chunks; feof() flags the final chunk.
while ($data = fread($fp, 4096))  {
    xml_parse($parser, $data, feof($fp)) or
    die (sprintf("XML Error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

fclose($fp);
xml_parser_free($parser);

This outputs:

-- Product --
Name: Apple
Description: Tasty!
Price: 40

-- Product --
Name: Orange
Description: Delicious!
Price: 50

You may notice that the element names in the switch statement are uppercase. This is because case folding, which is enabled by default for an XML parser, converts element names to uppercase before they are passed to the handlers.

You can get and set options for an XML parser using the xml_parser_get_option() and xml_parser_set_option() functions.
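Case folding applies to attribute names as well as element names. The sketch below uses the third argument of the start handler, which the example above leaves unused, to record each element's attributes. The id attribute is hypothetical and not part of products.xml.

```php
<?php
// Hypothetical input: the id attribute is not part of products.xml above.
$xml = '<product id="7"><name>Apple</name></product>';

$parser = xml_parser_create();
$seen = [];
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$seen) {
        $seen[$name] = $attrs; // attributes arrive as an associative array
    },
    function ($parser, $name) {
        // nothing to do at the end of an element in this sketch
    }
);

xml_parse($parser, $xml, true);
xml_parser_free($parser);

print_r($seen);
```

With the default settings, both the element names and the attribute name come back in uppercase, while attribute values are left untouched.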

For example, to disable the case-folding option, you can do the following.

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
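To confirm the effect, here is a minimal sketch that disables case folding, reads the option back and collects the element names from a small inline document (made up for illustration).

```php
<?php
// Mixed-case element names, chosen to make the effect visible.
$xml = '<productList><item/></productList>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// Read the option back; a falsy value means case folding is off.
$folding = xml_parser_get_option($parser, XML_OPTION_CASE_FOLDING);

$names = [];
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$names) {
        $names[] = $name; // names keep the case used in the document
    },
    function ($parser, $name) {
        // end tags are ignored in this sketch
    }
);

xml_parse($parser, $xml, true);
xml_parser_free($parser);

var_dump((bool) $folding);
print_r($names);
```

Note that with case folding disabled, the case labels in a switch statement like the one above would need to match the names exactly as written in the XML.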


Note: This article is based on PHP version 7.0.