Trying to read MS Word contents using PHP

Hi all,

I’m trying to read the contents of an MS Word file using fread or file_get_contents.
Reading the file works fine with both.

But my problem is explained in the attached file.

I want to ignore non-English characters because they are always converted into strange characters.

This is my code:

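function parseWord($userDoc) {
    // Read the whole file as raw bytes.
    $fileHandle = fopen($userDoc, "rb");
    $line = mb_convert_encoding(@fread($fileHandle, filesize($userDoc)), "UTF-8");
    fclose($fileHandle);

    // Split on carriage returns and skip empty chunks and chunks containing NUL bytes.
    $lines = explode(chr(0x0D), $line);
    $outtext = "";
    foreach ($lines as $thisline) {
        $pos = strpos($thisline, chr(0x00));
        if ($pos === false && strlen($thisline) > 0) {
            $outtext .= $thisline . " ";
        }
    }

    // Remove everything except ASCII letters, digits, whitespace and a few punctuation marks.
    $outtext = preg_replace("/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", "", $outtext);
    return $outtext;
}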
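To make clear what I mean by ignoring non-English characters, here is a minimal sketch of just the filtering step on its own. The function name stripNonAscii is hypothetical (it is not in my code above), and it assumes that "English" means printable ASCII plus ordinary whitespace, so accented letters are dropped too:

```php
<?php
// Hypothetical helper for illustration only; not part of parseWord() above.
// Assumption: "English characters" means printable ASCII plus tab, LF and CR.
function stripNonAscii(string $text): string
{
    // Remove every byte outside tab, LF, CR and the range 0x20-0x7E.
    $clean = preg_replace('/[^\x09\x0A\x0D\x20-\x7E]/', '', $text);

    // Collapse the whitespace runs left behind by the removed bytes.
    return trim(preg_replace('/\s+/', ' ', $clean));
}

// The two-byte UTF-8 sequences (\xC3\xA9) are removed; plain ASCII is kept.
echo stripNonAscii("Hello \xC3\xA9t\xC3\xA9 world"), "\n";
```

This only cleans up a string that has already been extracted. It does not deal with the binary structure of the .doc file itself.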

Can someone help?