I am parsing some XML, and at the end of each node there are some tags I cannot remove. Example:
<items>
<description>
<![CDATA[<P>Here is a description...</P>
<P><!--FLASH--></P>
<P>This course requires the latest FLASH Player &nbsp;&nbsp;&nbsp;</P>
<P></P>
<P><STRONG>CAUTION: Prior to downloading the FLASH Player, deselect the GoogleToolbar option</STRONG><BR><BR><BR><BR></P>
<P></P>]]>
</description>
<description>
<![CDATA[Here is a description...
<BR><!--FLASH-->
<BR>This course requires the latest FLASH Player &nbsp;&nbsp;&nbsp;
<P><STRONG>CAUTION: Prior to downloading the FLASH Player, deselect the GoogleToolbar option.</STRONG></P>]]>
</description>
</items>
I know the formatting of this file is not good, but I am not in charge of generating it, so I cannot edit it directly. I can only read from it.
Now my question: the company that makes this file added a <!--FLASH--> tag, so I can parse through the description node and break at the <!--FLASH--> tag. However, there are some <BR> and <P> tags right before it that I cannot strip because they are not always the same, and these add line breaks to the end of some of my descriptions. I don't want to strip all tags completely, because a description could have a line break in the middle.
I just want to remove the tags directly in front of the <!--FLASH--> tag. Can anybody think of a solution?
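To make the goal concrete, here is a rough Python sketch of the behaviour I am after, in case it helps anyone answering. It assumes the CDATA content is already available as a plain string; the names `description_before_marker` and `TRAILING_TAGS` are made up for illustration. It cuts the text at the marker and then removes only the run of opening <P> tags, <BR> tags, empty <P></P> pairs, &nbsp; entities and whitespace at the very end of the cut text, so line breaks in the middle of a description are left alone.

```python
import re

MARKER = "<!--FLASH-->"

# A run of whitespace, &nbsp;, <BR> tags, opening <P> tags, or empty
# <P></P> pairs that reaches the very end of the string (\Z).
# Closing </P> tags that follow real text are deliberately kept.
TRAILING_TAGS = re.compile(
    r"(?:\s|&nbsp;|<br\b[^>]*>|<p\b[^>]*>(?:\s*</p>)?)+\Z",
    re.IGNORECASE,
)


def description_before_marker(cdata_text, marker=MARKER):
    """Return the text before the marker with trailing <P>/<BR> tags removed.

    Hypothetical helper: if the marker is missing, the text is returned
    unchanged.
    """
    head, found, _tail = cdata_text.partition(marker)
    if not found:
        return cdata_text
    return TRAILING_TAGS.sub("", head)


if __name__ == "__main__":
    sample = "Here is a description...\n<BR><!--FLASH-->\n<BR>This course requires..."
    print(description_before_marker(sample))
```

Because the pattern is anchored at the end of the string, a <BR> between two sentences stays in place; only the tags that sit directly in front of the marker are dropped.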