-
Suppose my users can post comments, but they must not have HTML or anything other than plain text. I did this:

var s = new HtmlSanitizer();
s.AllowedTags.Clear();

If the input is:
How can I configure it so it gives me:
I realise this creates a problem if the input is
For example in HtmlRuleSanitizer there is a "tag flattening" feature.
Replies: 2 comments 1 reply
-
For your specific use case I would suggest using AngleSharp directly instead of HtmlSanitizer:

using AngleSharp.Html.Parser; // HtmlParser lives in this namespace

var parser = new HtmlParser();
var html = "foo bar <span>123</span> baz <script>alert('xss')</script> qux";
var doc = parser.ParseDocument(html);
// TextContent concatenates every text node, including the script body.
var text = doc.Body.TextContent;
// "foo bar 123 baz alert('xss') qux"
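
One caveat with the TextContent approach: the result is decoded text, so an input such as `&lt;script&gt;` comes back as literal `<script>` characters. If that text is later written into an HTML page, it needs to be encoded again. A minimal sketch, assuming .NET's `WebUtility.HtmlEncode` suits your output context; the class and method names here are hypothetical, not part of AngleSharp:

```csharp
using System.Net;
using AngleSharp.Html.Parser;

public static class CommentText
{
    // Hypothetical helper: drops all markup, then HTML-encodes the remaining text
    // so it can be embedded in a page as element content.
    public static string ToEncodedPlainText(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        var parser = new HtmlParser();
        var doc = parser.ParseDocument(input);

        // Body can be null in principle, so fall back to an empty string.
        var text = doc.Body?.TextContent ?? string.Empty;

        // Re-encode characters such as < > & before the text reaches HTML output.
        return WebUtility.HtmlEncode(text);
    }
}
```

If your view layer already encodes output (Razor does by default), store the raw `TextContent` instead and skip the extra encoding step, otherwise the text ends up encoded twice.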
-
If you really wanted to use HtmlSanitizer to accomplish this, I have this in a utility class (for pretty much the same reason):

/// <summary>
/// Sanitizes the specified HTML body fragment. Allows no markup.
/// </summary>
/// <param name="markup">An HTML body fragment.</param>
/// <param name="keepText">Whether or not to retain the text content from removed markup.</param>
/// <returns>The sanitized HTML body fragment.</returns>
public static string WhitewashMarkup(string markup, bool keepText = false)
{
    if (!string.IsNullOrWhiteSpace(markup))
    {
        // An empty AllowedTags set means every element is removed.
        var options = new HtmlSanitizerOptions
        {
            AllowedTags = new HashSet<string>()
        };
        var sanitizer = new HtmlSanitizer(options)
        {
            // When true, a removed element is replaced by its child nodes.
            KeepChildNodes = keepText
        };
        markup = sanitizer.Sanitize(markup);
    }
    return markup;
}

I'm sure there are easier ways, but this has worked for me for a while now.
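
For the original question (drop the tags, keep the text), the same idea can be sketched with the default sanitizer by clearing `AllowedTags` and setting `KeepChildNodes`. This is a minimal sketch, not a tested recipe: the helper and class names are hypothetical, and the `Ganss.Xss` namespace is an assumption that I believe holds for HtmlSanitizer 6 and later (older releases used `Ganss.XSS`).

```csharp
using Ganss.Xss; // assumed namespace for HtmlSanitizer 6+; older releases used Ganss.XSS

public static class PlainTextComments
{
    // Hypothetical helper name, not part of HtmlSanitizer.
    public static string StripTags(string markup)
    {
        var sanitizer = new HtmlSanitizer
        {
            // Replace each removed element with its children instead of dropping them.
            KeepChildNodes = true
        };

        // Allow no elements at all.
        sanitizer.AllowedTags.Clear();

        return sanitizer.Sanitize(markup);
    }
}
```

Note that `Sanitize` returns an HTML fragment rather than decoded text, so characters such as `<` in the remaining text stay entity-encoded in the result.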