This is one of the simplest regexes for removing HTML tags from a piece of HTML text. I know it's not the best, but I'd argue that it's one of the simplest. ;)
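// Requires: using System.Text.RegularExpressions;
public static string RemoveHtml(string txt)
{
    // Delete everything from a '<' up to the next '>'.
    return Regex.Replace(txt, @"<[^>]*>", "");
}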
Filed under: Technical | Tagged: .net, html, regex, tags, xml
Just an FYI for those reading this. This regex does not take into account the possibility of a “>” symbol in an attribute within a tag.
For example, the following:
1" src="test.jpg">
Would result in:
1" src="test.jpg">
Sorry, the markup in my previous comment was stripped by the very technique described above. I'll repost it with the markup escaped:
Just an FYI for those reading this. This regex does not take into account the possibility of a “>” symbol in an attribute within a tag.
For example, the following:
<img alt="2 > 1" src="test.jpg">
Would result in:
1" src="test.jpg">
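To make the limitation above concrete, here is a minimal sketch that puts the post's pattern next to a quote-aware variant. The class name HtmlStripper and the method RemoveHtmlQuoteAware are made up for illustration and are not from the original post. The variant only assumes that attribute values are wrapped in matching single or double quotes. It is still a regex and not an HTML parser, so comments, script blocks and malformed markup can trip it up.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class HtmlStripper
{
    // The pattern from the post: a tag ends at the first '>' found,
    // even if that '>' sits inside a quoted attribute value.
    public static string RemoveHtml(string txt)
    {
        return Regex.Replace(txt, @"<[^>]*>", "");
    }

    // Assumed variant (not from the post): inside a tag, accept either
    //   - a single character that is not '>', '"' or '\'', or
    //   - a complete double-quoted string, or
    //   - a complete single-quoted string,
    // so a '>' inside quotes no longer ends the tag early.
    public static string RemoveHtmlQuoteAware(string txt)
    {
        return Regex.Replace(txt, @"<(?:[^>""']|""[^""]*""|'[^']*')*>", "");
    }
}
```

With the input <img alt="2 > 1" src="test.jpg">, RemoveHtml leaves the tail of the tag behind (a space followed by 1" src="test.jpg">), while RemoveHtmlQuoteAware returns an empty string. For anything beyond quick clean-up, a real HTML parser is probably the safer choice.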