Vereyon.Web.HtmlSanitizer
1.8.0
dotnet add package Vereyon.Web.HtmlSanitizer --version 1.8.0
NuGet\Install-Package Vereyon.Web.HtmlSanitizer -Version 1.8.0
<PackageReference Include="Vereyon.Web.HtmlSanitizer" Version="1.8.0" />
paket add Vereyon.Web.HtmlSanitizer --version 1.8.0
#r "nuget: Vereyon.Web.HtmlSanitizer, 1.8.0"
// Install Vereyon.Web.HtmlSanitizer as a Cake Addin
#addin nuget:?package=Vereyon.Web.HtmlSanitizer&version=1.8.0
// Install Vereyon.Web.HtmlSanitizer as a Cake Tool
#tool nuget:?package=Vereyon.Web.HtmlSanitizer&version=1.8.0
HtmlRuleSanitizer
HtmlRuleSanitizer is a whitelist, rule-based HTML sanitizer built on top of the HTML Agility Pack. Use it to clean up HTML and remove malicious content.
var sanitizer = HtmlSanitizer.SimpleHtml5Sanitizer();
string cleanHtml = sanitizer.Sanitize(dirtyHtml);
Without configuration, HtmlRuleSanitizer will strip absolutely everything. This ensures that you are in control of what HTML is getting through. It was inspired by the client-side parser of the wysihtml5 editor.
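As a minimal sketch of the "nothing allowed by default" behaviour described above, the snippet below runs an unconfigured sanitizer next to the preconfigured SimpleHtml5Sanitizer on the same input. The class name and the sample input are made up for illustration, and the exact output is not shown because it depends on the library version:

```csharp
using System;
using Vereyon.Web;

public static class DefaultRulesExample
{
    public static void Main()
    {
        // Hypothetical sample input, not taken from the library documentation.
        string dirtyHtml = "<p>Hello <strong>world</strong><script>alert('x')</script></p>";

        // No rules configured: no tag has been whitelisted.
        var emptySanitizer = new HtmlSanitizer();
        Console.WriteLine(emptySanitizer.Sanitize(dirtyHtml));

        // Preconfigured rule set for simple HTML5 fragments.
        var simpleSanitizer = HtmlSanitizer.SimpleHtml5Sanitizer();
        Console.WriteLine(simpleSanitizer.Sanitize(dirtyHtml));
    }
}
```

Comparing the two printed results is a quick way to check which tags a given rule set actually lets through.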
Use cases
HtmlRuleSanitizer was designed with the following use cases in mind:
- Prevent cross-site scripting (XSS) attacks by removing JavaScript and other malicious HTML fragments.
- Restrict HTML to simple markup in order to allow for easy transformation to other document types without having to deal with all possible HTML tags.
- Enforce nofollow on links to discourage link spam.
- Clean up submitted HTML, for example by removing empty tags.
- Restrict HTML to a limited set of tags, for example in a comment system.
Features
- CSS class whitelisting
- Empty tag removal
- Tag whitelisting
- Tag attribute and CSS class enforcement
- Tag flattening to simplify document structure while maintaining content
- Tag renaming
- Attribute checks (e.g. URL validity) and whitelisting
- Attribute quote normalization
- A fluent style configuration interface
- HTML entity encoding
- Comment removal
Usage
Install the HtmlRuleSanitizer NuGet package.
Optionally add the following using statement in the file where you intend to use HtmlRuleSanitizer:
using Vereyon.Web;
Basic usage
var sanitizer = HtmlSanitizer.SimpleHtml5Sanitizer();
string cleanHtml = sanitizer.Sanitize(dirtyHtml);
Note: SimpleHtml5Sanitizer returns a rule set which does not allow for a full document definition. Use SimpleHtml5DocumentSanitizer for full documents.
Sanitize a document
When dealing with full HTML documents including the html and body tags, use SimpleHtml5DocumentSanitizer:
var sanitizer = HtmlSanitizer.SimpleHtml5DocumentSanitizer();
string cleanHtml = sanitizer.Sanitize(dirtyHtml);
Configuration
The code below demonstrates how to configure a rule set which only allows strong, i and a tags, and which requires links to have a valid URL, be nofollow and open in a new window. In addition, any b tag is renamed to strong, since the two render alike and strong is the semantically preferred element. Empty tags are removed. This would be a suitable configuration for comment processing.
var sanitizer = new HtmlSanitizer();
sanitizer.Tag("strong").RemoveEmpty();
sanitizer.Tag("b").Rename("strong").RemoveEmpty();
sanitizer.Tag("i").RemoveEmpty();
sanitizer.Tag("a").SetAttribute("target", "_blank")
    .SetAttribute("rel", "nofollow")
    .CheckAttributeUrl("href")
    .RemoveEmpty();
string cleanHtml = sanitizer.Sanitize(dirtyHtml);
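For a comment system, the rule set above could be wrapped in a small factory method so the configuration lives in one place. The following is only a sketch: the class name, the method name and the sample input are hypothetical, and only calls already shown in this README are used. The sanitized output is printed rather than stated here, since it depends on the library version:

```csharp
using System;
using Vereyon.Web;

public static class CommentSanitizerExample
{
    // Hypothetical helper: builds the comment rule set described above.
    public static HtmlSanitizer CreateCommentSanitizer()
    {
        var sanitizer = new HtmlSanitizer();
        sanitizer.Tag("strong").RemoveEmpty();
        sanitizer.Tag("b").Rename("strong").RemoveEmpty();
        sanitizer.Tag("i").RemoveEmpty();
        sanitizer.Tag("a").SetAttribute("target", "_blank")
            .SetAttribute("rel", "nofollow")
            .CheckAttributeUrl("href")
            .RemoveEmpty();
        return sanitizer;
    }

    public static void Main()
    {
        var sanitizer = CreateCommentSanitizer();

        // Made-up comment containing an allowed tag, a renamed tag,
        // an empty tag and a script element that is not whitelisted.
        string dirtyHtml =
            "<b>Nice post!</b> <i></i>" +
            "<a href=\"https://example.com\">my site</a>" +
            "<script>alert('x')</script>";

        Console.WriteLine(sanitizer.Sanitize(dirtyHtml));
    }
}
```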
CSS class whitelisting
Global CSS class whitelisting is achieved as follows, where the CSS classes are space-separated:
sanitizer.AllowCss("legal also-legal");
Custom attribute sanitization
Attribute sanitization can be performed by implementing a custom IHtmlAttributeSanitizer. The code below illustrates a simple custom sanitizer which overrides the attribute value:
class CustomSanitizer : IHtmlAttributeSanitizer
{
    public SanitizerOperation SanitizeAttribute(HtmlAttribute attribute, HtmlSanitizerTagRule tagRule)
    {
        // Override the attribute value and leave the attribute itself in place.
        attribute.Value = "123";
        return SanitizerOperation.DoNothing;
    }
}
The custom sanitizer can then be assigned to the desired attributes as follows:
var sanitizer = new HtmlSanitizer();
var attributeSanitizer = new CustomSanitizer();
sanitizer.Tag("span").SanitizeAttributes("style", attributeSanitizer);
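As a slightly more realistic illustration of the same mechanism, the sketch below truncates overly long attribute values instead of overwriting them with a constant. The class name MaxLengthAttributeSanitizer, the choice of the title attribute and the sample input are hypothetical. It also assumes that IHtmlAttributeSanitizer, SanitizerOperation and HtmlSanitizerTagRule live in the Vereyon.Web namespace and that HtmlAttribute comes from HtmlAgilityPack:

```csharp
using System;
using HtmlAgilityPack;
using Vereyon.Web;

// Hypothetical attribute sanitizer which caps the length of an attribute value.
public class MaxLengthAttributeSanitizer : IHtmlAttributeSanitizer
{
    private readonly int _maxLength;

    public MaxLengthAttributeSanitizer(int maxLength)
    {
        _maxLength = maxLength;
    }

    public SanitizerOperation SanitizeAttribute(HtmlAttribute attribute, HtmlSanitizerTagRule tagRule)
    {
        // Shorten the value when it exceeds the limit; keep the attribute itself.
        if (attribute.Value != null && attribute.Value.Length > _maxLength)
        {
            attribute.Value = attribute.Value.Substring(0, _maxLength);
        }
        return SanitizerOperation.DoNothing;
    }
}

public static class AttributeSanitizerExample
{
    public static void Main()
    {
        var sanitizer = new HtmlSanitizer();
        sanitizer.Tag("span").SanitizeAttributes("title", new MaxLengthAttributeSanitizer(20));

        // Made-up input with a title longer than 20 characters.
        string dirtyHtml = "<span title=\"a very long tooltip text that goes on and on\">text</span>";
        Console.WriteLine(sanitizer.Sanitize(dirtyHtml));
    }
}
```

Returning SanitizerOperation.DoNothing mirrors the example above: the sanitizer edits the value in place and asks for no further action on the attribute.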
Custom element sanitization
Element sanitization can be performed by implementing a custom IHtmlElementSanitizer, much like custom attribute sanitization.
The code below illustrates a custom sanitizer which removes span elements that contain the text "remove me":
var sanitizer = new HtmlSanitizer();
sanitizer.Tag("span").Sanitize(new CustomSanitizer(element =>
{
    return element.InnerText == "remove me"
        ? SanitizerOperation.RemoveTag
        : SanitizerOperation.DoNothing;
}));
Contributing
Contributions are welcome through a GitHub pull request.
Setup
dotnet restore
Tests
Got tests? Yes, see the tests project. It uses xUnit.
cd Web.HtmlSanitizer.Tests/
dotnet test
More information
License
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
.NET Framework | net461 was computed. net462 is compatible. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 is compatible. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETFramework 4.6.2
  - HtmlAgilityPack (>= 1.7.1)
- .NETFramework 4.8
  - HtmlAgilityPack (>= 1.7.1)
- .NETStandard 2.0
  - HtmlAgilityPack (>= 1.7.1)
- .NETStandard 2.1
  - HtmlAgilityPack (>= 1.7.1)
- net6.0
  - HtmlAgilityPack (>= 1.7.1)
- net7.0
  - HtmlAgilityPack (>= 1.7.1)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Vereyon.Web.HtmlSanitizer:
Package | Downloads |
---|---|
SuperiorAcumaticaPackage: Dependencies required to compile the SuperiorAcumaticaSolution for Acumatica 2024 R2 Build 24.204.0004 |
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
1.8.0 | 86,930 | 10/28/2023 |
1.7.1 | 17,208 | 8/6/2023 |
1.7.0 | 51,244 | 4/8/2023 |
1.6.0.1 | 377,001 | 2/6/2021 |
1.6.0 | 20,426 | 4/19/2020 |
1.6.0-beta1 | 433 | 3/29/2020 |
1.5.1 | 72,907 | 10/27/2019 |
1.5.0 | 33,138 | 12/26/2018 |
1.4.0 | 31,993 | 12/26/2017 |
1.3.1.1 | 91,132 | 9/26/2017 |
1.3.1 | 18,833 | 6/14/2017 |
1.3.0 | 3,440 | 1/15/2017 |
1.2.1 | 1,191 | 1/15/2017 |
1.2.0 | 5,111 | 8/15/2016 |
1.1.4 | 2,699 | 5/13/2016 |
1.1.3 | 8,165 | 1/15/2016 |
1.1.2 | 3,212 | 11/7/2015 |
1.1.1 | 2,049 | 7/8/2015 |
1.1.0 | 1,353 | 6/18/2015 |
1.0.0 | 1,643 | 5/3/2015 |
This release adds support for custom element sanitizers, implements attribute quote normalization and enables nullable reference types to reduce the chance of NullReferenceExceptions.