Word HTML Parser

Anyone have any ideas as to how to parse an .html file made by Microsoft Word?

I’m basically building a Flash file so that someone can have their Word documents presented in Flash.

How, for example, would I kill text between <meta> tags, and that sort of thing?

A recursive string_replace function, or something to that effect, would be helpful.
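
For what it's worth, here is a rough sketch of one way this could go, using a real HTML parser rather than repeated string replacement. It is written in Python purely for illustration, and the names WordHtmlText, word_html_to_text, SKIP_CONTENT and BLOCK_TAGS are made up for this example. The list of elements to drop is an assumption about what Word typically emits, not a complete inventory. Note that <meta> tags have no closing tag and no text of their own, so ignoring every tag and keeping only the body text gets rid of them automatically.

```python
from html.parser import HTMLParser

# Elements whose entire contents are thrown away (assumed list, adjust as needed).
SKIP_CONTENT = {"head", "title", "style", "script", "xml"}
# Elements that end a line of output text (also an assumption).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class WordHtmlText(HTMLParser):
    """Collects visible text and ignores all markup, including <meta> and <o:p>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside an element listed in SKIP_CONTENT
        self.chunks = []      # pieces of text that were kept

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
        elif tag == "br" and self.skip_depth == 0:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Text is kept only when it is outside every skipped element.
        if self.skip_depth == 0:
            self.chunks.append(data)


def word_html_to_text(html):
    """Return the plain text of a Word-exported HTML string, one block per line."""
    parser = WordHtmlText()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # Collapse runs of whitespace (including non-breaking spaces) on each line.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        '<html><head><meta name="Generator" content="Microsoft Word 11">'
        "<title>Doc</title><style>p.MsoNormal{margin:0}</style></head>"
        "<body><p class=MsoNormal>Hello <b>world</b><o:p></o:p></p>"
        "<p class=MsoNormal>Second&nbsp;line</p></body></html>"
    )
    print(word_html_to_text(sample))
```

The plain text that comes out could then be loaded into Flash however is convenient. If bold, italics and so on need to survive, the same class could be extended to emit a small whitelist of tags instead of dropping them all.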