DocumentTextExtractor 1.0.4
There is a newer version of this package available.
See the version list below for details.
dotnet add package DocumentTextExtractor --version 1.0.4
NuGet\Install-Package DocumentTextExtractor -Version 1.0.4
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="DocumentTextExtractor" Version="1.0.4" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add DocumentTextExtractor --version 1.0.4
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
#r "nuget: DocumentTextExtractor, 1.0.4"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy it into the interactive tool or the script's source code to reference the package.
// Install DocumentTextExtractor as a Cake Addin
#addin nuget:?package=DocumentTextExtractor&version=1.0.4

// Install DocumentTextExtractor as a Cake Tool
#tool nuget:?package=DocumentTextExtractor&version=1.0.4
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
DocumentTextExtractor
Simple C# library for extracting text and metadata from .docx, .pptx, and .xlsx files
DocumentTextExtractor provides simple methods for extracting text and metadata from .docx, .pptx, and .xlsx files.
New in v1.0.x
- Initial release
- Support for docx, pptx, and xlsx
Disclaimer
This library has been tested on a limited set of documents. It is highly likely that documents exist from which the library, in its current state, cannot extract text.
Simple Examples
Refer to the Test project for a full example.
using System.Collections.Generic;
using DocumentTextExtractor;

void Main(string[] args)
{
    // Extract text and metadata from a Word document.
    using (DocxTextExtractor docx = new DocxTextExtractor("./temp/", "mydocument.docx"))
    {
        string docxText = docx.ExtractText();
        Dictionary<string, string> docxMetadata = docx.ExtractMetadata();
    }

    // Extract text and metadata from a PowerPoint presentation.
    using (PptxTextExtractor pptx = new PptxTextExtractor("./temp/", "mypresentation.pptx"))
    {
        string pptxText = pptx.ExtractText();
        Dictionary<string, string> pptxMetadata = pptx.ExtractMetadata();
    }

    // Extract text and metadata from an Excel workbook.
    using (XlsxTextExtractor xlsx = new XlsxTextExtractor("./temp/", "myspreadsheet.xlsx"))
    {
        string xlsxText = xlsx.ExtractText();
        Dictionary<string, string> xlsxMetadata = xlsx.ExtractMetadata();
    }
}
Version History
Please refer to CHANGELOG.md.
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
- net6.0
  - System.Text.Json (>= 7.0.3)
  - XmlToPox (>= 1.0.3)
- net7.0
  - System.Text.Json (>= 7.0.3)
  - XmlToPox (>= 1.0.3)
- net8.0
  - System.Text.Json (>= 7.0.3)
  - XmlToPox (>= 1.0.3)
Initial release