OfficeIMO.Html
0.1.3
dotnet add package OfficeIMO.Html --version 0.1.3
OfficeIMO.Html contains shared HTML ingestion primitives and first-party HTML bridge APIs used by OfficeIMO converters.
It owns the reusable parts that should behave consistently across HTML-to-Markdown, HTML-to-Word, HTML-to/from-RTF, and HTML-backed PDF workflows:
- URL policy evaluation and base URI resolution
- AngleSharp document parsing helpers
- DOM traversal facts and node/depth limit tracking
- image source discovery for img, lazy-loading attributes, srcset, and picture/source
- image data URI parsing and media-type extension mapping
- semantic HTML to/from RTF conversion over the dependency-free OfficeIMO.Rtf model
It does not replace output-specific engines. Markdown AST creation, Word document generation, RTF document generation, and PDF orchestration stay in their owning packages such as OfficeIMO.Markdown.Html, OfficeIMO.Word.Html, OfficeIMO.Rtf, and OfficeIMO.Html.Pdf.
RTF Bridge
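```csharp
using OfficeIMO.Html;
using OfficeIMO.Rtf; // assumed namespace of RtfDocument (from the OfficeIMO.Rtf dependency)

// Parse an HTML fragment into the RTF document model.
RtfDocument document = "<p>Hello <strong>RTF</strong></p>".ToRtfDocument();
// Serialize the model to RTF text, or back to HTML.
string rtf = document.ToRtf();
string html = document.ToHtml();
```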
RTF-to-RTF editing in OfficeIMO.Rtf remains the lossless preservation path. The HTML bridge is semantic: it preserves supported text, inline formatting, links, lists, tables, bookmarks, fields, form fields, notes, tracked revisions, object metadata, shape metadata, and embedded PNG/JPEG images without Office/COM automation.
URL Policy
```csharp
var policy = HtmlUrlPolicy.CreateWebOnlyProfile();
string href = HtmlUrlPolicyEvaluator.ResolveUrl(
    "/docs/start.html",
    new Uri("https://example.com/"),
    policy);
```
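To make the idea of policy-checked resolution concrete, here is a minimal sketch that uses only `System.Uri`. The `WebOnlyUrlSketch` class is hypothetical and is not part of OfficeIMO.Html; it also assumes that a "web-only" policy means accepting `http` and `https` results only, which the package may define differently.

```csharp
#nullable enable
using System;

// Hypothetical helper, not an OfficeIMO.Html API.
public static class WebOnlyUrlSketch {
    // Resolves a possibly relative URL against a base URI and returns the
    // absolute form, or null when the result is not an http/https URL.
    public static string? Resolve(string url, Uri baseUri) {
        if (string.IsNullOrWhiteSpace(url)) return null;

        // Uri.TryCreate(Uri, string, out Uri) performs relative-reference resolution.
        if (!Uri.TryCreate(baseUri, url.Trim(), out Uri? resolved)) return null;

        // Assumption: "web-only" means http and https, nothing else.
        bool isWeb = resolved.Scheme == Uri.UriSchemeHttp
                  || resolved.Scheme == Uri.UriSchemeHttps;
        return isWeb ? resolved.AbsoluteUri : null;
    }
}
```

Under these assumptions, `Resolve("/docs/start.html", new Uri("https://example.com/"))` returns `https://example.com/docs/start.html`, while a `javascript:` or `mailto:` URL returns `null`.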
Parsing And Base URIs
```csharp
var document = HtmlDocumentParser.ParseDocument(html);
Uri? baseUri = HtmlDocumentParser.ResolveEffectiveBaseUri(
    document,
    new Uri("https://example.com/articles/"));
```
Traversal Limits
```csharp
HtmlDomLimitTracker? tracker = HtmlDomLimitTracker.Create(
    maxHtmlNodes: 10000,
    maxHtmlDepth: 64);
```
Converter packages use these primitives to keep bounded HTML ingestion behavior consistent while still reporting converter-specific diagnostics.
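As an illustration of what bounded traversal means in practice, the sketch below walks a tree with an explicit stack and stops as soon as either limit is exceeded. `SketchNode` and `BoundedTraversalSketch` are hypothetical stand-ins written for this example; they do not describe how `HtmlDomLimitTracker` is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a DOM node; real code would walk AngleSharp nodes.
public sealed class SketchNode {
    public List<SketchNode> Children { get; } = new List<SketchNode>();
}

public static class BoundedTraversalSketch {
    // Returns true when the whole tree fits within both limits.
    // 'visited' reports how many nodes were seen before stopping.
    public static bool TryCount(SketchNode root, int maxNodes, int maxDepth, out int visited) {
        visited = 0;
        // An explicit stack avoids call-stack overflow on deeply nested input.
        var stack = new Stack<(SketchNode Node, int Depth)>();
        stack.Push((root, 1));
        while (stack.Count > 0) {
            var (node, depth) = stack.Pop();
            visited++;
            if (visited > maxNodes || depth > maxDepth) return false;
            foreach (var child in node.Children) stack.Push((child, depth + 1));
        }
        return true;
    }
}
```

A converter could call such a check before or during conversion and report its own diagnostic when the method returns `false`.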
Shared Diagnostics And Gallery Contracts
```csharp
var report = new HtmlDiagnosticReport();
report.Add("OfficeIMO.Word.Html", "HtmlCommentSkipped", "Comment skipped");
var scenario = new HtmlCapabilityGalleryScenario(
    "quarterly-report",
    "Quarterly Report",
    "Word HTML",
    "HTML import, DOCX validation, and round-trip export proof");
```
HtmlDiagnosticReport and the capability-gallery contracts provide a common shape for HTML converters, PDF bridges, readers, tests, and future documentation generators. Format-specific packages can keep their existing compatibility APIs while also publishing shared diagnostics and artifact metadata for market-facing proof galleries.
Conversion Document And Normalized HTML
```csharp
var conversion = HtmlConversionDocumentBuilder.Build(html, new HtmlConversionDocumentOptions {
    Profile = HtmlConversionProfile.Document,
    BaseUri = new Uri("https://example.com/reports/"),
    UrlPolicy = HtmlUrlPolicy.CreateWebOnlyProfile()
});
string normalized = conversion.NormalizedHtml;
var resources = conversion.ResourcePlan.GetSummary(HtmlResourceKind.Image);
var styles = conversion.StyleSummary;
```
HtmlConversionDocument is the shared conversion contract for OfficeIMO HTML workflows. It keeps the original source HTML together with the logical document model, computed-style summary, resource manifest, resource dependency plan, normalized HTML, and profile contract.
Target packages can accept this shared document while keeping target-specific rendering in their owning packages. For example, OfficeIMO.Word.Html can load it into WordDocument, OfficeIMO.Markdown.Html can render Markdown from it, and OfficeIMO.Html.Pdf can choose the matching bridge profile.
Normalized HTML output is policy-aware: URL-bearing attributes are resolved against the configured base URI, disallowed URLs are removed, boolean attributes are normalized, event-handler attributes are stripped by default, and non-document executable elements are skipped. It is intended for clean review, gallery proof, and choosing input for downstream adapters. It does not attempt to reproduce browser layout.
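Two of the normalization rules above, stripping event handlers and normalizing boolean attributes, can be sketched with plain collections. The `AttributeNormalizationSketch` class, its short boolean-attribute list, and the `name="name"` canonical form are assumptions made for illustration; the form OfficeIMO.Html actually emits is not stated here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not an OfficeIMO.Html API.
public static class AttributeNormalizationSketch {
    // Assumption: a small illustrative subset of HTML boolean attributes.
    private static readonly HashSet<string> BooleanAttributes =
        new HashSet<string>(StringComparer.Ordinal) {
            "checked", "disabled", "hidden", "open", "readonly", "required", "selected"
        };

    public static Dictionary<string, string> Normalize(
        IEnumerable<KeyValuePair<string, string>> attributes) {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (var attribute in attributes) {
            string name = attribute.Key.Trim().ToLowerInvariant();
            if (name.Length == 0) continue;

            // Drops onclick, onload, and so on. This prefix test is deliberately
            // blunt: it removes every attribute whose name begins with "on".
            if (name.StartsWith("on", StringComparison.Ordinal)) continue;

            // Assumed canonical form for boolean attributes: name="name".
            result[name] = BooleanAttributes.Contains(name) ? name : attribute.Value;
        }
        return result;
    }
}
```

URL-bearing attributes such as `href` and `src` would additionally pass through URL resolution and policy checks, which this sketch leaves out.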
Image Sources
```csharp
string source = HtmlImageSourceResolver.ResolveImageSource(
    imageElement,
    baseUri,
    HtmlUrlPolicy.CreateOfficeIMOProfile());
```
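Part of image source discovery is choosing one candidate from a `srcset` attribute. The sketch below shows one possible selection rule, picking the candidate with the largest descriptor value. `SrcsetSketch` is hypothetical, and the rule is an assumption rather than the package's documented behavior. It also assumes that all descriptors in one attribute are of the same kind (`w` or `x`) and that URLs contain no commas, which does not hold for data URIs.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not an OfficeIMO.Html API.
public static class SrcsetSketch {
    // Returns the URL of the candidate with the largest numeric descriptor,
    // or an empty string when no usable candidate is found.
    public static string PickLargest(string srcset) {
        string best = string.Empty;
        double bestScore = -1;
        foreach (string candidate in srcset.Split(',')) {
            string[] parts = candidate.Split(
                new[] { ' ', '\t', '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length == 0) continue;

            double score = 1; // a missing descriptor is treated as 1x
            if (parts.Length > 1) {
                // Strip the trailing unit letter ("w" or "x") and parse the number.
                string descriptor = parts[1];
                string number = descriptor.Substring(0, descriptor.Length - 1);
                if (!double.TryParse(number, NumberStyles.Float,
                        CultureInfo.InvariantCulture, out score)) continue;
            }
            if (score > bestScore) {
                bestScore = score;
                best = parts[0];
            }
        }
        return best;
    }
}
```

The selected URL would still need to be resolved against the base URI and checked against the URL policy before use.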
Image Data URIs
```csharp
if (HtmlImageDataUri.TryParse(source, out var dataUri) && dataUri.IsBase64) {
    byte[] bytes = dataUri.DecodeBytes();
    string extension = dataUri.FileExtension;
}
```
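For readers unfamiliar with the `data:` URI layout, the following sketch parses the header, decodes a base64 payload, and maps a few media types to file extensions using only the base class library. `DataUriSketch` and its extension table are hypothetical and illustrative; they are not the implementation behind `HtmlImageDataUri`.

```csharp
using System;

// Hypothetical helper, not an OfficeIMO.Html API.
public static class DataUriSketch {
    // Parses "data:<media-type>[;param...][;base64],<payload>".
    // Only base64 payloads are decoded; anything else returns false.
    public static bool TryDecodeBase64(string value, out string mediaType, out byte[] bytes) {
        mediaType = string.Empty;
        bytes = Array.Empty<byte>();
        if (value == null || !value.StartsWith("data:", StringComparison.OrdinalIgnoreCase)) return false;

        int comma = value.IndexOf(',');
        if (comma < 0) return false;

        // Header segments: media type first, then parameters such as "base64".
        string[] header = value.Substring(5, comma - 5).Split(';');
        bool isBase64 = false;
        for (int i = 1; i < header.Length; i++) {
            if (header[i].Trim().Equals("base64", StringComparison.OrdinalIgnoreCase)) isBase64 = true;
        }
        if (!isBase64) return false;

        try {
            bytes = Convert.FromBase64String(value.Substring(comma + 1).Trim());
        } catch (FormatException) {
            return false;
        }

        // RFC 2397 defaults the media type to text/plain when it is omitted.
        string declared = header[0].Trim();
        mediaType = declared.Length == 0 ? "text/plain" : declared.ToLowerInvariant();
        return true;
    }

    // Assumption: a small illustrative mapping, not the package's full table.
    public static string ExtensionFor(string mediaType) {
        switch (mediaType) {
            case "image/png": return ".png";
            case "image/jpeg": return ".jpg";
            case "image/gif": return ".gif";
            case "image/svg+xml": return ".svg";
            default: return ".bin";
        }
    }
}
```

Returning `false` for malformed or non-base64 input, instead of throwing, keeps the caller's control flow similar to the `TryParse` pattern shown above.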
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
| .NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 is compatible. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETFramework 4.7.2
  - AngleSharp (>= 1.5.1)
  - AngleSharp.Css (>= 1.0.0-beta.216)
  - OfficeIMO.Rtf (>= 0.1.2)
- .NETStandard 2.0
  - AngleSharp (>= 1.5.1)
  - AngleSharp.Css (>= 1.0.0-beta.216)
  - OfficeIMO.Rtf (>= 0.1.2)
- net10.0
  - AngleSharp (>= 1.5.1)
  - AngleSharp.Css (>= 1.0.0-beta.216)
  - OfficeIMO.Rtf (>= 0.1.2)
- net8.0
  - AngleSharp (>= 1.5.1)
  - AngleSharp.Css (>= 1.0.0-beta.216)
  - OfficeIMO.Rtf (>= 0.1.2)
NuGet packages (2)
Showing the top 2 NuGet packages that depend on OfficeIMO.Html:
| Package | Description |
|---|---|
| OfficeIMO.Markdown.Html | HTML converter for OfficeIMO.Markdown - Convert HTML fragments or documents into OfficeIMO.Markdown documents and Markdown text. |
| OfficeIMO.Word.Html | HTML converter for OfficeIMO.Word - Convert Word documents to/from HTML using AngleSharp |