tidy::repairString

tidy_repair_string

(PHP 5, PHP 7, PHP 8, PECL tidy >= 0.7.0)

tidy::repairString -- tidy_repair_string - Repair a string using an optionally provided configuration file

Description

Object-oriented style

public static function tidy::repairString(string $string, array|string|null $config = null, ?string $encoding = null): string|false

Procedural style

function tidy_repair_string(string $string, array|string|null $config = null, ?string $encoding = null): string|false

Repairs the given string.

Parameters

string

The data to be repaired.

config

The config can be passed either as an array or as a string. If a string is passed, it is interpreted as the name of a configuration file; otherwise, it is interpreted as the options themselves.

For an explanation of each option, see » http://api.html-tidy.org/#quick-reference.
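As a minimal sketch of the array form of config, the snippet below passes a few Tidy options directly. The input markup and the particular option choices are illustrative assumptions, not part of the original page; a string in the same position would instead be treated as the path to a configuration file.

```php
<?php
// Hypothetical input: the <p> is never closed and there is a stray </i>.
$html = '<p>error</i>';

// Options passed as an array; the keys are Tidy option names.
$config = [
    'indent'         => true, // indent block-level content
    'show-body-only' => true, // emit only the contents of <body>
    'wrap'           => 0,    // disable line wrapping
];

$clean = tidy_repair_string($html, $config);

echo $clean, PHP_EOL;
?>
```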

encoding

The encoding parameter sets the encoding for input/output documents. The possible values for encoding are: ascii, latin0, latin1, raw, utf8, iso2022, mac, win1252, ibm858, utf16, utf16le, utf16be, big5, and shiftjis.
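The following sketch shows the encoding argument in use, assuming the input is UTF-8. The sample markup is hypothetical, and the static call form assumes PHP 8.0 or later.

```php
<?php
// Hypothetical UTF-8 input containing non-ASCII characters.
$html = '<p>naïve café</i>';

// The third argument names the encoding used for both input and output.
$clean = tidy::repairString($html, ['show-body-only' => true], 'utf8');

echo $clean, PHP_EOL;
?>
```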

Return Values

Returns the repaired string, or false on failure.
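Because the function can return false, callers may want a strict comparison before using the result. The wrapper below is one possible approach; the name repair_or_fail() is hypothetical and not part of the tidy extension.

```php
<?php
/**
 * Hypothetical helper: repairs markup, throwing instead of returning false.
 */
function repair_or_fail(string $markup, ?array $config = null, ?string $encoding = null): string
{
    $result = tidy_repair_string($markup, $config, $encoding);

    // Compare strictly: an empty string is a valid result, false is a failure.
    if ($result === false) {
        throw new RuntimeException('tidy_repair_string() failed');
    }

    return $result;
}

$repaired = repair_or_fail('<p>error</i>');
echo $repaired;
?>
```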

Changelog

Version Description
8.0.0 tidy::repairString() is a static method now.
8.0.0 config and encoding are nullable now.
8.0.0 This function no longer accepts the useIncludePath parameter.

Examples

Example #1 tidy::repairString() example

<?php
ob_start();
?>

<html>
<head>
<title>test</title>
</head>
<body>
<p>error</i>
</body>
</html>

<?php

$buffer = ob_get_clean();
$tidy = new tidy();
$clean = $tidy->repairString($buffer);

echo $clean;
?>

The above example will output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>test</title>
</head>
<body>
<p>error</p>
</body>
</html>

See Also

  • tidy::parseFile() - Parse markup in file or URI
  • tidy::parseString() - Parse a document stored in a string
  • tidy::repairFile() - Repair a file and return it as a string

User Contributed Notes 3 notes

gnuffo1 at gmail dot com
15 years ago
You can also use this function to repair xml, for example if stray ampersands etc are breaking it:

<?php
$xml = tidy_repair_string($xml, array(
    'output-xml' => true,
    'input-xml' => true
));
?>
dan-dot-hunsaker-at-gmail-dot-com
14 years ago
The docs referenced at http://tidy.sourceforge.net/docs/quickref.html above state that the configuration option 'sort-attributes' is an enumeration of 'none' and 'alpha', thereby specifying that strings of either form are the acceptable values.  This may not be the case, however - on my system, the option was not honored until I set it to true.  This may also be the case with other options, so experiment a bit.  The output of tidy::getConfig() may be useful in this regard.
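One way to experiment along these lines is to parse a small document with the option set and then read it back from tidy::getConfig(). This is only a sketch: the sample markup is hypothetical, and whether 'sort-attributes' is accepted or honored as 'alpha' depends on the libtidy build in use.

```php
<?php
$tidy = new tidy();

// 'alpha' follows the quick reference; some builds may expect a different value.
$tidy->parseString(
    '<p id="b" class="a">text</p>',
    ['sort-attributes' => 'alpha'],
    'utf8'
);

// getConfig() returns the effective options as an associative array.
$config = $tidy->getConfig();

// Inspect how the library actually stored the option (null if unknown).
var_dump($config['sort-attributes'] ?? null);
?>
```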
Romolo
9 years ago
Fixing a broken ods/odt document with tidy is very simple.
I wrote the following code to be run from the command line:

<?php
$zip = new ZipArchive();
// open() returns true on success and a (truthy) error code on failure
if ($zip->open($argv[1]) === true) {
  $fp = $zip->getStream('content.xml'); // file inside archive
  if (!$fp)
    die("Error: can't get stream to document file");
  $buf = ""; // file buffer
  ob_start(); // swallow any CRC error message printed while reading
    while (!feof($fp)) {
      $buf .= fread($fp, 2048);
    }
  ob_end_clean();
  fclose($fp);
  $zip->close();
  $config = array(
      'indent' => true,
      'clean' => true,
      'input-xml'  => true,
      'output-xml' => true,
      'wrap'       => false
  );
  $tidy = new tidy();
  $xml = $tidy->repairString($buf, $config);
  if ($xml === false)
    die("Error: tidy could not repair the document");
  $array = explode("\n", $xml); // split() was removed in PHP 7
  $file = tempnam("/tmp", "xml");
  $fp = fopen($file, "w"); // "rw+" is not a valid fopen() mode
  foreach ($array as $key => $value) {
    fwrite($fp, trim($value));
    if ($key == 0) {
      fwrite($fp, "\n"); // keep a line break only after the first line
    }
  }
  fclose($fp);
  if ($zip->open($argv[1]) === true) {
    $zip->deleteName('content.xml');
    $zip->addFile($file, 'content.xml');
    $zip->close(); // the archive is written here, so the temp file must still exist
    echo 'recovery complete';
  } else {
    echo 'recovery failed';
  }
  unlink($file);
}
?>

Save it to a file called fixdoc and invoke it as:
php fixdoc yourbrokendoc

For your safety, please work on a copy of your document.