This repository has been archived by the owner on Apr 26, 2020. It is now read-only.
I have a page fetched with cURL. Its charset is windows-1250, and its doctype is
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
I convert the encoding of my string, check it, and replace the meta charset in the string:
$html = str_replace('windows-1250', 'UTF-8', mb_convert_encoding($result, 'UTF-8'));
var_dump(mb_detect_encoding($html, "UTF-8, ASCII, ISO-8859-1, windows-1250"));
$Doc = \phpQuery::newDocumentHTML($html, 'UTF-8');
echo pq($Doc)->html();
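For comparison, here is a minimal sketch of the same conversion step with the source encoding named explicitly instead of left to the default. It assumes the iconv extension is available and that the fetched bytes really are windows-1250; `toUtf8Html` and the sample markup are hypothetical names made up for illustration, not part of phpQuery.

```php
<?php
// Hypothetical helper (not part of phpQuery): convert a fetched page to UTF-8
// and rewrite the charset it declares, naming the source encoding explicitly.
function toUtf8Html(string $raw, string $sourceCharset = 'windows-1250'): string
{
    // iconv() returns false when the input has bytes invalid in $sourceCharset.
    $utf8 = iconv($sourceCharset, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Could not convert from $sourceCharset");
    }
    // Keep the declared charset in sync with the actual bytes.
    return str_ireplace($sourceCharset, 'UTF-8', $utf8);
}

// Assumed sample input: "\xE8" is "č" and "\x9A" is "š" in windows-1250.
$result = '<meta http-equiv="Content-Type" content="text/html; charset=windows-1250" />'
        . "<p>\xE8\x9A</p>";

$html = toUtf8Html($result);

// Confirms the converted string is well-formed UTF-8.
var_dump(mb_check_encoding($html, 'UTF-8'));
```

Checking with `mb_check_encoding` only validates that the bytes are well-formed UTF-8. It does not prove that the conversion started from the right source charset, so comparing a known accented character by hand is still worthwhile.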
All the UTF-8 characters come out garbled. var_dump says it's UTF-8,
content-type="text/plain; charset=UTF-8"
. When I
var_dump($Doc);
I see that the DOMDocument encoding and xmlEncoding values are null. But if I use:
$Dom = new \DOMDocument();
$Dom->loadHTML($html);
and var_dump it, then everything is fine and the characters are OK.
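The plain DOMDocument check described above can be reproduced in isolation with a sketch like the following. The sample markup is an assumption made up for illustration: a UTF-8 string that declares its charset in a meta tag, as the page does after the replacement. It prints the encoding the document reports and checks that a non-ASCII text node survives the round trip.

```php
<?php
// Assumed sample: UTF-8 markup that declares its charset in a meta tag.
$html = '<html><head>'
      . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
      . '</head><body><p>' . "\xC4\x8D\xC5\xA1" . '</p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Encoding recorded on the parsed document.
var_dump($dom->encoding);

// Text content comes back as UTF-8; compare it with the original bytes.
$text = $dom->getElementsByTagName('p')->item(0)->textContent;
var_dump($text === "\xC4\x8D\xC5\xA1");
```

Running the same two checks against the DOMDocument held by the phpQuery document would show whether the difference lies in how the markup is loaded or in how it is serialized afterwards.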
I've checked createDocumentWrapper, and the $contentType is OK. If I set the static $debug to true, I get this:
`string 'Load markup for content type text/html;charset=utf-8' (length=52)
string 'Loading HTML, content type 'text/html;charset=utf-8'' (length=52)
string 'Full markup load (HTML):
' (length=275)
string 'DOC: UTF-8 REQ: UTF-8' (length=21)
string 'Full markup load (HTML), documentCreate('utf-8')' (length=48)
string 'Selecting document '52280a0c077ec7c5fb2f2350db12f22c' as default one' (length=68)`