DOMDocument::saveHTML

(PHP 5, PHP 7, PHP 8)

DOMDocument::saveHTML - Dumps the internal document into a string using HTML formatting

Description

public function DOMDocument::saveHTML(?DOMNode $node = null): string|false

Creates an HTML document from the DOM representation. This function is usually called after building a new DOM document from scratch, as in the example below.

Parameters

node

Optional parameter to output a subset of the document.

Return Values

Returns the HTML, or false if an error occurred.

Examples

Example #1 Saving an HTML tree into a string

<?php

$doc = new DOMDocument('1.0');

$root = $doc->createElement('html');
$root = $doc->appendChild($root);

$head = $doc->createElement('head');
$head = $root->appendChild($head);

$title = $doc->createElement('title');
$title = $head->appendChild($title);

$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);

echo $doc->saveHTML();

?>
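
The optional node parameter described above can be illustrated with a short sketch. It assumes a hand-built document like the one in Example #1; the variable names and the title text are only illustrative. Passing a node to saveHTML() should serialize just that subtree rather than the whole document.

```php
<?php
// Build a tiny document by hand (content is illustrative only).
$doc = new DOMDocument('1.0');
$html = $doc->appendChild($doc->createElement('html'));
$head = $html->appendChild($doc->createElement('head'));
$title = $head->appendChild($doc->createElement('title'));
$title->appendChild($doc->createTextNode('This is the title'));

// Serialize only the <title> subtree instead of the whole document.
$fragment = $doc->saveHTML($title);

if ($fragment === false) {
    // saveHTML() is documented to return false on failure.
    echo 'Serialization failed', PHP_EOL;
} else {
    echo $fragment, PHP_EOL;
}
```

Checking the result against false before using it matches the documented return type of string|false.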

See Also

  • DOMDocument::saveHTMLFile() - Dumps the internal document into a file using HTML formatting
  • DOMDocument::loadHTML() - Load HTML from a string
  • DOMDocument::loadHTMLFile() - Load HTML from a file

User Contributed Notes 12 notes

tomas dot strejcek at ghn dot cz
9 years ago
As of PHP 5.4 and libxml 2.7.8, there is a simpler approach. If you load the HTML like this:

$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

the output will contain no doctype, html, or body tags.
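
A minimal sketch of the approach above, assuming a libxml build that supports both flags. The sample markup and variable names are made up for illustration, and the input deliberately has a single root element.

```php
<?php
// Sample input with exactly one root element (illustrative only).
$content = '<div><p>Hello, world</p></div>';

$doc = new DOMDocument();
// NOIMPLIED: do not add implied <html>/<body> elements.
// NODEFDTD:  do not add a default doctype when none is present.
$doc->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$output = $doc->saveHTML();
echo $output;
```

With a supporting libxml version, the output should contain the div and its paragraph without any wrapping doctype, html, or body tags.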

sasha @ goldnet dot ca
8 years ago
When you save an HTML fragment that was loaded with the LIBXML_HTML_NOIMPLIED option, the result can end up "broken", because libxml requires a single root element. libxml tries to fix the fragment by treating the first opened tag it encounters as the root and moving its closing tag to the end of the string.

For example:

<h1>Foo</h1><p>bar</p>

will end up as:

<h1>Foo<p>bar</p></h1>

The easiest workaround is to add a root tag yourself and strip it afterwards:

$html->loadHTML('<html>' . $content . '</html>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$content = str_replace(array('<html>', '</html>'), '', $html->saveHTML());
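
As a variation on the workaround above, the wrapper can also be dropped without string replacement by serializing each child of the wrapper with saveHTML($node). This is only a sketch: the helper name fragmentToHtml and the choice of a div wrapper are assumptions made for illustration, not part of the original note.

```php
<?php
/**
 * Round-trips an HTML fragment through DOMDocument.
 * Hypothetical helper, written for illustration only.
 */
function fragmentToHtml(string $content): string
{
    $doc = new DOMDocument();
    // Wrap the fragment so that libxml sees a single root element.
    $doc->loadHTML(
        '<div>' . $content . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    // Serialize the wrapper's children one by one, leaving the wrapper out.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$result = fragmentToHtml('<h1>Foo</h1><p>bar</p>');
echo $result, PHP_EOL;
```

Unlike str_replace(), this does not touch any literal `<html>` text that might legitimately appear inside the fragment.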

jeboy
8 years ago
LIBXML_HTML_NOIMPLIED doesn't work on PHP 7.1.9 with libxml2-2.7.8

contact at cathexis dot de
9 years ago
If you load HTML from a string, make sure the charset is declared.

<?php
...
$html_src = '<html><head><meta content="text/html; charset=utf-8" http-equiv="Content-Type"></head><body>';
$html_src .= '...';
...
?> 

Otherwise the charset will be ISO-8859-1!
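
A small sketch of the point above, assuming the script file itself is saved as UTF-8. The sample text and variable names are illustrative. The meta declaration is prepended before the content so that the parser sees it first.

```php
<?php
// UTF-8 encoded sample content (illustrative only).
$snippet = '<p>Grüße aus Köln</p>';

// Declare the charset before the content so the parser decodes it as UTF-8.
$html_src  = '<html><head><meta content="text/html; charset=utf-8" http-equiv="Content-Type"></head><body>';
$html_src .= $snippet;
$html_src .= '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html_src);

// DOM strings are returned as UTF-8, so the text should round-trip unchanged.
$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;
```

Without the declaration, the same bytes would typically be interpreted as a single-byte encoding and the umlauts would come back garbled.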

Anonymous
17 years ago
To keep script tags from being output as <script />, you can use the DOMDocumentFragment class:

<?php

$doc = new DOMDocument();
$doc->loadXML($xmlstring);
$fragment = $doc->createDocumentFragment();
/* Append the script element to the fragment as a raw XML string (it is preserved in its raw form) and, if successful, insert it into the DOM tree */
if ($fragment->appendXML("<script type='text/javascript' src='$source'></script>")) {
  $xpath = new DOMXPath($doc);
  /* namespace-safe way to find all head elements that are children of the html element; should return only 1 match */
  $resultlist = $xpath->query("//*[local-name() = 'html']/*[local-name() = 'head']");
  foreach ($resultlist as $headnode) { // insert the script tag
    $headnode->appendChild($fragment);
  }
}
echo $doc->saveXML(); /* and our script tags will still be <script></script> */

?>

Anonymous
16 years ago
If you want a simpler way to get around the <script> tag problem try:

<?php

  $script = $doc->createElement ('script');
  // Creating an empty text node forces <script></script>
  $script->appendChild ($doc->createTextNode (''));
  $head->appendChild ($script);

?>

Anonymous
10 years ago
To solve the script tag problem just add an empty text node to the script node and DOMDocument will render <script src="your.js"></script> nicely.
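
The empty-text-node trick from the two notes above can be sketched as a self-contained script. The file name your.js comes from the note; the variable names are illustrative. The script element is serialized with saveXML(), which is where the self-closing form would otherwise appear.

```php
<?php
$doc = new DOMDocument();

$script = $doc->createElement('script');
$script->setAttribute('src', 'your.js');
// An empty text child keeps the XML serializer from collapsing the element.
$script->appendChild($doc->createTextNode(''));
$doc->appendChild($script);

// Serialize only the script element, using XML rules.
$xml = $doc->saveXML($script);
echo $xml, PHP_EOL;
```

The element should then be written with an explicit closing tag instead of the short form.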

xoplqox
18 years ago
XHTML:

If the output should be XHTML, use saveXML() instead.

Output example for saveHTML:

<select name="pet" size="3" multiple>
    <option selected>mouse</option>
    <option>bird</option>
    <option>cat</option>
</select>

XHTML-conformant output using saveXML:

<select name="pet" size="3" multiple="multiple">
    <option selected="selected">mouse</option>
    <option>bird</option>
    <option>cat</option>
</select>
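
The difference can be reproduced with a short sketch. It assumes the boolean attributes were set with explicit values (multiple="multiple", selected="selected"); the element and variable names follow the example above and are otherwise illustrative.

```php
<?php
$doc = new DOMDocument();

$select = $doc->appendChild($doc->createElement('select'));
$select->setAttribute('name', 'pet');
$select->setAttribute('multiple', 'multiple');

$option = $select->appendChild($doc->createElement('option', 'mouse'));
$option->setAttribute('selected', 'selected');

// HTML serialization: boolean attributes are typically written in minimized form.
$asHtml = $doc->saveHTML($select);
// XML serialization: every attribute is written as name="value".
$asXml = $doc->saveXML($select);

echo $asHtml, PHP_EOL;
echo $asXml, PHP_EOL;
```

Comparing the two printed lines shows which serializer fits the target document type.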

archanglmr at yahoo dot com
18 years ago
If you created your DOMDocument object with loadHTML() (where the source comes from another site) and want to pass your changes back to the browser, make sure the HTTP Content-Type header matches the value of your meta content-type tag, because modern browsers seem to ignore the meta tag and trust only the HTTP header. For example, if you are reading an ISO-8859-1 document and your web server claims UTF-8, correct it with the header() function.

<?php
header('Content-Type: text/html; charset=iso-8859-1');
?>

tyson at clugg dot net
21 years ago
<?php
// Using DOM to fix sloppy HTML.
// An example by Tyson Clugg <tyson@clugg.net>
//
// vim: syntax=php expandtab tabstop=2

function tidyHTML($buffer)
{
  // load our document into a DOM object
  // (loadHTML() is an instance method; calling it statically fails on PHP 8)
  $dom = new DOMDocument();
  @$dom->loadHTML($buffer);
  // we want nice output
  $dom->formatOutput = true;
  return $dom->saveHTML();
}

// start output buffering, using our nice
// callback function to format the output.
ob_start("tidyHTML");

?>
<html>
<p>It's like comparing apples to oranges.
</html>
<?php

// this will be called implicitly, but we'll
// call it manually to illustrate the point.
ob_end_flush();

?>

The above code takes sloppy HTML:
 <html>
 <p>It's like comparing apples to oranges.
 </html>

And cleans it up to the following:
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
 <html><body><p>It's like comparing apples to oranges.
 </p></body></html>
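
A variation on the idea above that avoids the @ operator: a minimal sketch using libxml_use_internal_errors() so that parser warnings are collected internally instead of being emitted. The function name tidyHtmlQuietly is hypothetical, and the fallback to the original buffer on failure is a design assumption, not part of the original note.

```php
<?php
/**
 * Re-serializes sloppy HTML through DOMDocument without emitting warnings.
 * Hypothetical helper, written for illustration only.
 */
function tidyHtmlQuietly(string $buffer): string
{
    // Collect libxml errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($buffer);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $html = $dom->saveHTML();
    // Fall back to the untouched input if serialization fails.
    return $html === false ? $buffer : $html;
}

$clean = tidyHtmlQuietly("<html>\n<p>It's like comparing apples to oranges.\n</html>");
echo $clean;
```

If the parse errors are of interest, libxml_get_errors() can be called before libxml_clear_errors() to inspect them.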

Anonymous
18 years ago
<?php
function getDOMString($retNode) {
  if (!$retNode) return null;
  $retval = strtr($retNode->ownerDocument->saveXML($retNode),
  array(
    '></area>' => ' />',
    '></base>' => ' />',
    '></basefont>' => ' />',
    '></br>' => ' />',
    '></col>' => ' />',
    '></frame>' => ' />',
    '></hr>' => ' />',
    '></img>' => ' />',
    '></input>' => ' />',
    '></isindex>' => ' />',
    '></link>' => ' />',
    '></meta>' => ' />',
    '></param>' => ' />',
    'default:' => '', 
    // sometimes, you have to decode entities too...
    '&quot;' => '&#34;',
    '&amp;' =>  '&#38;',
    '&apos;' => '&#39;',
    '&lt;' =>   '&#60;',
    '&gt;' =>   '&#62;',
    '&nbsp;' => '&#160;',
    '&copy;' => '&#169;',
    '&laquo;' => '&#171;',
    '&reg;' =>   '&#174;',
    '&raquo;' => '&#187;',
    '&trade;' => '&#8482;'
  ));
  return $retval;
}
?>

qrworld.net
11 years ago
In this post http://softontherocks.blogspot.com/2014/11/descargar-el-contenido-de-una-url_11.html I found a simple way to get the content of a URL with DOMDocument, loadHTMLFile and saveHTML().

function getURLContent($url){
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = FALSE;
    @$doc->loadHTMLFile($url);
    return $doc->saveHTML();
}