PHP - Encoding issues with GBK pages and DOMXPath
I'm trying to cURL the link below, which is encoded in GBK. I want to extract the title of the product and its image. When I echo the document to test whether it's working, I don't get the Chinese characters. I need to extract the values using DOMXPath and display them on my website as the same characters, not as garbled ones. How can I make this work?
$ch = curl_init("http://item.taobao.com/item.htm?spm=a2106.m874.1000384.41.ag3kbi&id=20811635147&_u=o1ffj7oi9ad3&scm=1029.newlist-0.1.16&ppath=&sku=");
// cURL option constants are case-sensitive and must be upper case
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
$content = curl_exec($ch);
curl_close($ch);

$doc = new DOMDocument();
// "auto" leaves the source encoding to mbstring's detection order
$searchpage = mb_convert_encoding($content, 'utf-8', "auto");
$doc->loadHTML($searchpage);
echo $doc->saveHTML();
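Check whether mbstring.language in php.ini is set for GBK (Simplified Chinese), since "auto" is expanded according to that setting, or name the source encoding explicitly: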
$searchpage = mb_convert_encoding($content, 'utf-8', "gb18030");
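For the DOMXPath part of the question, here is a minimal sketch rather than a definitive implementation. The function name `extractTitleAndImage`, the XPath expressions, and the sample markup are assumptions for illustration; the real page will need its own selectors. It assumes this file is saved as UTF-8 and uses a small in-memory page in place of the cURL response. Besides naming the source encoding, it turns non-ASCII characters into numeric entities before calling `loadHTML()`, so the HTML parser does not have to guess the byte encoding.

```php
<?php
// Hypothetical helper, not from the original post: converts a GBK/GB18030
// page to UTF-8 and pulls out the title and the first image URL.
function extractTitleAndImage(string $rawHtml, string $sourceEncoding = 'GB18030'): array
{
    // Name the source encoding explicitly instead of relying on "auto".
    $utf8 = mb_convert_encoding($rawHtml, 'UTF-8', $sourceEncoding);

    // Replace every non-ASCII character with a numeric entity so the input
    // handed to the parser is plain ASCII.
    $ascii = mb_encode_numericentity($utf8, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($ascii);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // DOM returns strings as UTF-8, so these can be echoed on a UTF-8 page.
    return [
        'title' => trim($xpath->evaluate('string(//title)')),
        'image' => $xpath->evaluate('string((//img)[1]/@src)'),
    ];
}

// Stand-in for $content from curl_exec(): a tiny page encoded as GB18030.
$sample = mb_convert_encoding(
    '<html><head><title>中文商品标题</title></head>'
    . '<body><img src="http://example.com/item.jpg"></body></html>',
    'GB18030',
    'UTF-8'
);

$result = extractTitleAndImage($sample);
echo $result['title'], "\n";
echo $result['image'], "\n";
```

With the real page you would pass `$content` from `curl_exec()` instead of `$sample`, and make sure the page that prints the result is itself served as UTF-8.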