MarkItDown 0.0.1
.NET CLI: dotnet add package MarkItDown --version 0.0.1
Package Manager: NuGet\Install-Package MarkItDown -Version 0.0.1
PackageReference: <PackageReference Include="MarkItDown" Version="0.0.1" />
Central Package Management (Directory.Packages.props): <PackageVersion Include="MarkItDown" Version="0.0.1" />
Central Package Management (project file): <PackageReference Include="MarkItDown" />
Paket CLI: paket add MarkItDown --version 0.0.1
Script & Interactive: #r "nuget: MarkItDown, 0.0.1"
Cake Addin: #addin nuget:?package=MarkItDown&version=0.0.1
Cake Tool: #tool nuget:?package=MarkItDown&version=0.0.1
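The `PackageVersion` entry and the version-less `PackageReference` listed above are two halves of one setup: NuGet Central Package Management, where the version is pinned once for the whole repository and each project references the package by name only. A minimal sketch, assuming a `Directory.Packages.props` file at the repository root; the surrounding elements are standard MSBuild boilerplate, not something taken from this package's documentation:

```xml
<!-- Directory.Packages.props (repository root): pins the version for all projects -->
<Project>
  <PropertyGroup>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>
  <ItemGroup>
    <PackageVersion Include="MarkItDown" Version="0.0.1" />
  </ItemGroup>
</Project>
```

```xml
<!-- Inside the .csproj of a project that uses the package: no Version attribute here -->
<ItemGroup>
  <PackageReference Include="MarkItDown" />
</ItemGroup>
```

Projects that do not use central management only need the single `PackageReference` line that carries the `Version` attribute.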
MarkItDown.Net
A .NET library for converting various file formats to Markdown, making it ideal for indexing, text analysis, and other applications that benefit from structured text. This project builds upon the early work of MarkItDownSharp.
Supported Formats
- Word (.docx)
- Excel (.xlsx)
- Images (EXIF metadata extraction and optional LLM-based description)
- Audio (EXIF metadata extraction only)
- HTML
- Text-based formats (plain text, .csv, .xml, .rss, .atom)
- Jupyter Notebooks (.ipynb)
- Bing Search Result Pages (SERP)
- ZIP files (recursively iterates over contents)
- PowerPoint (.pptx)
- Confluence (spaces and single pages)
Features
- Modern .NET implementation
- Enhanced performance and reliability
- Expanded format support
- Improved error handling
- Comprehensive documentation
- Extensible third-party service integration
Third-Party Services Support
The library is designed with extensibility in mind, allowing integration with various third-party services for enhanced functionality.
Currently Supported Services
- Aliyun OCR Service
  - Document text recognition
  - Table structure recognition
  - Handwriting recognition
  - More OCR capabilities based on Aliyun's offerings
Adding New Services
The library provides a flexible plugin architecture for adding new service integrations. Documentation for implementing custom service providers will be available soon.
Getting Started
[Coming soon]
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Acknowledgments
This project is based on the early work of MarkItDownSharp by kelter-antunes. We are grateful for their initial implementation, which provided a solid foundation for this enhanced version.
License
[License information coming soon]
Note
Speech recognition for the audio converter is planned for a future release. Contributions in this area are especially welcome.
Product | Compatible and additional computed target framework versions |
---|---|
.NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
Dependencies (net9.0)
- ClosedXML (>= 0.104.2)
- ClosedXML.Parser (>= 1.3.0)
- DocumentFormat.OpenXml (>= 3.2.0)
- HtmlAgilityPack (>= 1.11.72)
- Markdig (>= 0.40.0)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.4)
- NAudio (>= 2.2.1)
- Newtonsoft.Json (>= 13.0.3)
- PdfPig (>= 0.1.9)
- ReverseMarkdown (>= 4.6.0)
- SharpZipLib (>= 1.4.2)
- SixLabors.Fonts (>= 1.0.1)
- TagLibSharp (>= 2.3.0)
- YoutubeExplode (>= 6.3.13)
NuGet packages (1)
Showing the top NuGet package that depends on MarkItDown:
Package | Downloads |
---|---|
MarkItDown.Extensions.AliyunOCR (Package Description) | |
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
0.0.1 | 169 | 4/22/2025 |