Count the number of words in a document using PHP

I'm trying to count the number of words in a document using PHP.
I already have a function that returns the number of words in a string:


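function wordCount($string) {
    // The ereg* functions were removed in PHP 7, so use preg_* instead.
    // The /u modifier assumes the input (and this source file) is UTF-8.
    $words = array();
    // Split on runs of whitespace and drop empty pieces.
    $tokens = preg_split('/\s+/u', $string, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        // Count a token only if it contains at least one letter or digit.
        if (preg_match('/[0-9A-Za-zÀ-ÖØ-öø-ÿ]/u', $token)) {
            $words[] = $token;
        }
    }
    return count($words);
}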

So I thought I could read the file and put its content in a string, but it's not working.

BTW, the documents I'm trying to do this with are Word, Excel and PDF…

Any ideas on how to do this?

Thanks.

Andres:

Did you find a solution for this counting problem? Would you give me a hint please? I am also having the same challenge now.

Thanks,

jmam.

jmam, a Word document (.docx) is just a zip archive. Simply change the .docx extension to .zip, open it, and find which file holds the content (I forget which one).

After that, use the zip API to open it in memory, read the XML document, and grab the nodes' content.
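
Here is a rough sketch of those steps, not a tested solution. It assumes PHP's zip and DOM extensions are enabled, that the file is a standard (transitional) .docx whose body is stored in word/document.xml, and that the text sits in w:t elements in the usual WordprocessingML namespace. The function name docxToText is made up for this example. It does not handle old .doc files, Excel or PDF, and text boxes nested inside paragraphs may be counted twice.

```php
<?php
// Hypothetical helper: pull the plain text out of a .docx file.
// Returns the text as a string, or false if the file cannot be read.
function docxToText($path) {
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return false; // not a readable zip archive
    }
    // Assumption: the main body lives in word/document.xml.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return false; // entry not found in the archive
    }

    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return false; // entry is not well-formed XML
    }

    // Namespace used by transitional WordprocessingML documents.
    $ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
    $paragraphs = array();
    foreach ($doc->getElementsByTagNameNS($ns, 'p') as $p) {
        // Join the text runs of one paragraph without extra spaces.
        $text = '';
        foreach ($p->getElementsByTagNameNS($ns, 't') as $t) {
            $text .= $t->nodeValue;
        }
        $paragraphs[] = $text;
    }
    // One paragraph per line, so words in adjacent paragraphs stay separate.
    return implode("\n", $paragraphs);
}
```

The string it returns can then be passed to a word-counting function such as the wordCount function from the first post, for example wordCount(docxToText('report.docx')), where report.docx is just a placeholder file name.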

Hope that helps you out.