Iconic.Transliterator 1.1.1

There is a newer version of this package available.
See the version list below for details.
.NET CLI:
dotnet add package Iconic.Transliterator --version 1.1.1

Package Manager:
NuGet\Install-Package Iconic.Transliterator -Version 1.1.1
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
<PackageReference Include="Iconic.Transliterator" Version="1.1.1" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI:
paket add Iconic.Transliterator --version 1.1.1

Script & Interactive:
#r "nuget: Iconic.Transliterator, 1.1.1"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

Cake:
// Install Iconic.Transliterator as a Cake Addin
#addin nuget:?package=Iconic.Transliterator&version=1.1.1

// Install Iconic.Transliterator as a Cake Tool
#tool nuget:?package=Iconic.Transliterator&version=1.1.1

Transliterator

A library that can help with any text transliteration, like slug creation. It supports multiple languages and can accept new ones very easily.

Usage

You can install the library through NuGet:

NuGet\Install-Package Iconic.Transliterator -Version 1.1.1

Here is a brief sample of how to use the library:

var transliterator = new Transliterator();
transliterator.addConversion(new GreekToEnglish());

var message = "Σεβόμαστε την ιδιωτικότητά σας";
var transliterated = transliterator.convert(message);  // "Sevomaste tin idiotikotita sas"

transliterator.addConversion(new EnglishToSlug());

var slug = transliterator.convert(message); // "sevomaste-tin-idiotikotita-sas"

You can add as many conversions as you want; they are applied in series, one after the other.
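
Applying conversions in series can be pictured as folding the text through a list of transformations. The following is a minimal sketch of that idea only, not the library's implementation: PipelineSketch and Run are hypothetical names, and each conversion is modelled as a plain Func<string, string>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Iconic.Transliterator.
public static class PipelineSketch
{
    // Feeds the output of each step into the next one, in list order.
    public static string Run(string text, IEnumerable<Func<string, string>> steps) =>
        steps.Aggregate(text, (current, step) => step(current));
}
```

With this shape, adding a step at the end of the list changes the final result without touching the earlier steps, which is the behaviour the sample above relies on when EnglishToSlug is added after GreekToEnglish.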

Languages

New conversions can be added by creating a new class and implementing the ConversionInterface. The functionality splits letters into three categories.

  • Plain letters that are replaced one-to-one with their corresponding Latin letter. For example, the Greek β corresponds to v.
  • Combinations of more than one letter that are pronounced differently according to what comes before or after them. For example, ευ is pronounced ef if it is followed by κ, and ev if it is followed by α. The combinations can be many, and you need to include them all: for example, ευα corresponds to eva, and ευκ corresponds to efk.
  • Letters that are represented by two Latin letters. For example, the German ä, which sounds something like ae.

The reason for this distinction is that these categories have to be transliterated separately and in series, so that the composite ones are not corrupted.

The transliterator first replaces the combinations, then the plain letters, and finally the dual ones. That way, the combinations are not destroyed by plain-letter replacements, and the replaced duals are not destroyed by accidental letter or combination occurrences in the already transliterated text.
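
To see why the order matters, here is a minimal sketch of the three passes. It is an illustration under assumptions, not the library's code: ThreePassSketch, its tables, and the individual letter mappings (for example υ to y) are hypothetical, and Greek and German entries are mixed in one class only to show all three categories.

```csharp
using System;

// Hypothetical sketch, not part of Iconic.Transliterator.
public static class ThreePassSketch
{
    // Illustrative tables only; a real conversion needs far more entries.
    static readonly (string Source, string Target)[] Combinations =
        { ("ευα", "eva"), ("ευκ", "efk") };
    static readonly (string Source, string Target)[] Plain =
        { ("α", "a"), ("ε", "e"), ("υ", "y"), ("κ", "k"), ("ο", "o"), ("λ", "l") };
    static readonly (string Source, string Target)[] Duals =
        { ("ä", "ae") };

    // Applies every rule of one table to the text, in table order.
    static string Apply(string text, (string Source, string Target)[] table)
    {
        foreach (var (source, target) in table)
            text = text.Replace(source, target);
        return text;
    }

    // Combinations first, then plain letters, then duals.
    public static string Convert(string text) =>
        Apply(Apply(Apply(text, Combinations), Plain), Duals);

    // Wrong order on purpose: plain letters consume ε, υ and κ one by one,
    // so the ευκ combination never gets a chance to match.
    public static string ConvertPlainFirst(string text) =>
        Apply(Apply(Apply(text, Plain), Combinations), Duals);
}
```

With these tables, Convert("ευκολα") yields "efkola", while ConvertPlainFirst("ευκολα") yields "eykola", because the combination has already been broken up by the single-letter rules.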

Letter and combination matches should be defined in lower case. The transliterator tries to replace occurrences that are all lower case, all caps, or capitalized, and to maintain their capitalization. Unusual capitalization, such as mid-word caps, can cause unexpected misses or hits.
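
One way to picture the capitalization handling is to derive an all-caps and a capitalized variant from each lower-case rule and replace all three. This is a sketch of that idea under assumptions, not the library's actual algorithm; CaseAwareReplaceSketch and its methods are hypothetical names.

```csharp
using System;

// Hypothetical sketch, not part of Iconic.Transliterator.
public static class CaseAwareReplaceSketch
{
    // Upper-cases the first character and leaves the rest untouched.
    static string Capitalize(string s) =>
        s.Length == 0 ? s : char.ToUpperInvariant(s[0]) + s.Substring(1);

    // source and target are expected in lower case, as the rules require.
    public static string Replace(string text, string source, string target)
    {
        // All caps first. For a single-letter source the all-caps and
        // capitalized variants are identical, so this pass wins (Ψ -> PS).
        text = text.Replace(source.ToUpperInvariant(), target.ToUpperInvariant());
        // Capitalized: only the first letter is upper case.
        text = text.Replace(Capitalize(source), Capitalize(target));
        // Plain lower case.
        return text.Replace(source, target);
    }
}
```

Because only these three variants are generated, a mixed-case occurrence such as εΥκ matches none of them and is left as it is, which is the kind of miss the paragraph above warns about.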

The final part of the ConversionInterface defines a general-purpose transformation for the entire text. It can be used for things like ToLower().

This way, we can create conversion classes to do whatever transformations we need. For example, we could create a Profanity class, and replace all nasty words in our Profanity DB table with ******.

There are currently classes for GreekToEnglish, GermanToEnglish and EnglishToSlug. Conversion choices are not meant to be proper Romanization, but rather what would be easy to understand on social media, which is usually known as Greeklish. I don't speak German, so I can't verify the functionality or completeness of that class; I just used it as an example to test dual letters. Please help if you see something wrong. The Slug is a pseudo-language that defines the replacement rules to produce proper slugs. It currently collapses runs of multiple spaces, replaces spaces with dashes, and converts everything to lower case.
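
The slug rules described above can be sketched in a few lines. This is an illustration only, not the EnglishToSlug class itself: SlugSketch and Slugify are hypothetical names, and trimming leading and trailing spaces is an added assumption so that the slug does not start or end with a dash.

```csharp
using System.Text.RegularExpressions;

// Hypothetical sketch, not part of Iconic.Transliterator.
public static class SlugSketch
{
    public static string Slugify(string text)
    {
        // Assumption: surrounding whitespace is dropped first.
        var trimmed = text.Trim();
        // A run of one or more spaces becomes a single dash.
        var dashed = Regex.Replace(trimmed, " +", "-");
        // Everything is converted to lower case.
        return dashed.ToLowerInvariant();
    }
}
```

For example, Slugify("Sevomaste  tin idiotikotita sas") returns "sevomaste-tin-idiotikotita-sas", even though the input contains a double space.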

TODO:

  • Add functionality to allow more complex conversions, like Markdown to HTML
Compatible and additional computed target framework versions

Product: .NET
Compatible: net6.0
Computed: net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows
  • net6.0

    • No dependencies.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version  Downloads  Last updated
2.0.0    295        3/12/2023
1.1.2    233        2/27/2023
1.1.1    236        2/27/2023
1.0.0    250        2/26/2023

Initial Release