Patter.Net
2.0.4
dotnet add package Patter.Net --version 2.0.4
NuGet\Install-Package Patter.Net -Version 2.0.4
<PackageReference Include="Patter.Net" Version="2.0.4" />
paket add Patter.Net --version 2.0.4
#r "nuget: Patter.Net, 2.0.4"
// Install Patter.Net as a Cake Addin
#addin nuget:?package=Patter.Net&version=2.0.4
// Install Patter.Net as a Cake Tool
#tool nuget:?package=Patter.Net&version=2.0.4
A simple pattern matching library for fluently extracting patterned data from text.
Rationale
I frequently find myself trying to pull a bit of structured data out of a stream of text. Regex is an amazingly powerful tool, but I have always struggled to get it to return structured results, and frequently just end up writing my own little parser to extract the information I want. Patter is a simple library to describe seeking and grabbing the chunks of data you want without the complexity of Regex.
- Is it more powerful than Regex? Absolutely not. If Regex is your jam, use Regex.
- Is it easier to read and get data out? In many cases (and in my honest opinion) yes.
Extracting text
You use PatternBuilder<string> to define a Pattern which knows how to extract data from a string.
For example:
var pattern = new PatternBuilder<string>()
.SeekPast("<foo>")
.CaptureUntil("</foo>")
.Build();
var results = pattern.Matches("Show <foo>one</foo> and <foo>two</foo>");
This returns a list of strings:
["one","two"]
Extracting complex objects
Let's say you want to extract anchors from a blob of text into an object (ALink):
public class ALink
{
public string Text {get;set;}
public Uri Url {get;set;}
}
And then define a pattern using PatternBuilder with ALink as the result type:
// Requires: using System.Diagnostics; using System.Linq; using Newtonsoft.Json;
// Define a pattern that returns an enumeration of ALink objects.
var pattern = new PatternBuilder<ALink>()
    // seek to <a
    .Seek("<a")
    // seek past the href attribute
    .SeekPast("href=")
    // skip quotes if there are any
    .Skip(Chars.Quotes)
    // capture everything up to the closing tag or end quote, convert it to a Uri and store it in ALink.Url
    .CaptureUntil(">'\"".ToArray(), (context) => context.Match.Url = new Uri(context.MatchText))
    // skip quotes if there are any
    .Skip(Chars.Quotes)
    // seek past the end of the opening tag
    .SeekPast(">")
    // capture everything up to the closing </a tag and put it into ALink.Text
    .CaptureUntil("</a", (context) => context.Match.Text = context.MatchText.Trim())
    .Build();
var matches = pattern.Matches("this is a <a href=\"http://foo.com\">link1</a> <a href=http://bar.com>link2</a>").ToList();
Debug.WriteLine(JsonConvert.SerializeObject(matches));
This will extract the text and urls from the tags. It's an enumerable, so you can use LINQ statements to further manipulate the results.
[
  {
    "Text":"link1",
    "Url":"http://foo.com"
  },
  {
    "Text":"link2",
    "Url":"http://bar.com"
  }
]
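Because the matches are an ordinary enumerable, any LINQ operator can be applied to them. The sketch below illustrates this with a hand-built list shaped like the output above, standing in for the result of pattern.Matches(...) so that it runs without the library. The LinkQueries class and its DistinctHosts helper are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ALink
{
    public string Text { get; set; }
    public Uri Url { get; set; }
}

public static class LinkQueries
{
    // Hypothetical helper: distinct host names of the links, sorted ordinally.
    public static List<string> DistinctHosts(IEnumerable<ALink> links) =>
        links.Where(l => l.Url != null)
             .Select(l => l.Url.Host)
             .Distinct()
             .OrderBy(h => h, StringComparer.Ordinal)
             .ToList();

    public static void Main()
    {
        // Stand-in for pattern.Matches(text); same shape as the JSON output above.
        var matches = new List<ALink>
        {
            new ALink { Text = "link1", Url = new Uri("http://foo.com") },
            new ALink { Text = "link2", Url = new Uri("http://bar.com") },
        };

        Console.WriteLine(string.Join(", ", DistinctHosts(matches)));
        // prints: bar.com, foo.com
    }
}
```

The same query could be chained directly onto the enumerable returned by Matches() instead of calling ToList() first.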
Methods
Method | Description |
---|---|
Seek(text) | Move the cursor to next instance of text |
Seek(char[]) | Move the cursor to next instance of one of the chars |
SeekPast(text) | Move the cursor to just past the next instance of text |
SeekPast(char[]) | Move the cursor to the first instance of any of the chars, and then on to the first char that is not in the set |
Skip(char[]) | Move the cursor to first char that is not in the set of chars |
Capture(char[], func) | Capture chars while they are in the set of chars, then call func(context) to give you the ability to extract info from context.MatchText and put it into context.Match |
CaptureUntil(text, func) | Capture characters until text is found, then call func(context) to give you ability to extract info from the context.MatchText and put into the context.Match |
CaptureUntil(char[], func) | Capture characters until one of chars is found, call func(context) to give you ability to extract info from the context.MatchText and put into the context.Match |
CaptureUntilPast(text, func) | Capture characters until text is found including text, then call func(context) to give you ability to extract info from the context.MatchText and put into the context.Match |
CaptureUntilPast(char[], func) | Capture characters until one of chars is found, including all chars, call func(context) to give you ability to extract info from context.MatchText and put into context.Match |
Call(func) | Lets you write a custom pattern operation. You are responsible for changing context properties directly (Pos, MatchText, Match, HasMatch, etc.) |
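To make the cursor semantics in the table concrete, here is a minimal sketch of Seek, SeekPast and CaptureUntil written with plain string operations. It is not Patter's implementation: the CursorSketch class is a hypothetical name, the -1 result for "not found" borrows the convention described for Pos, and returning an empty capture when the terminator is missing is an assumption of this sketch.

```csharp
using System;

public static class CursorSketch
{
    // Seek: index of the next occurrence of text at or after pos, or -1 if there is none.
    public static int Seek(string input, string text, int pos) =>
        pos < 0 ? -1 : input.IndexOf(text, pos, StringComparison.Ordinal);

    // SeekPast: index just after the next occurrence of text, or -1 if there is none.
    public static int SeekPast(string input, string text, int pos)
    {
        int found = Seek(input, text, pos);
        return found < 0 ? -1 : found + text.Length;
    }

    // CaptureUntil: the text between pos and the next terminator.
    // The returned cursor stops at the terminator itself (assumed behavior).
    public static (string captured, int pos) CaptureUntil(string input, string terminator, int pos)
    {
        int end = Seek(input, terminator, pos);
        if (end < 0)
        {
            return ("", -1); // assumption: no terminator means nothing is captured
        }
        return (input.Substring(pos, end - pos), end);
    }

    public static void Main()
    {
        string input = "Show <foo>one</foo>";
        int pos = SeekPast(input, "<foo>", 0);              // cursor lands just after "<foo>"
        var (captured, next) = CaptureUntil(input, "</foo>", pos);
        Console.WriteLine($"{captured} (cursor at {next})");
        // prints: one (cursor at 13)
    }
}
```

The difference between Seek and SeekPast is only where the cursor ends up: on the first character of the match, or just after its last character.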
PatternContext
The PatternContext object represents the current state of the parser and is passed to the func callbacks described above. It has the following properties of interest:
Property | Description |
---|---|
Pos | The current index into the string. It will be -1 when you are past the end of the string. |
Text | The full text of the string that is being worked on |
MatchText | The current matched text for a CaptureXXX() method |
HasMatch | Indicates that there is a match to be returned in the enumeration. At the end of enumerating the operations, if HasMatch is true, context.Match is yielded to the caller. |
Match | The object of type T that is yielded to the caller. You modify this object to build up the object that is yielded to the caller as a match. |
CurrentChar | Shortcut for the char value at the current Pos. If Pos == -1, it will be (char)0 |
Memory | A Property bag scoped to all matches. This is useful for custom actions to track data across all matches |
MatchMemory | A Property bag scoped to each match. It is reset when a sequence of operations is completed and a match is returned to caller. |
Chars
The Chars class defines classes of useful characters for matching:
Name | Description |
---|---|
Chars.Digits | Digits - 0..9 |
Chars.Letters | Alphabetical ascii letters |
Chars.LettersOrDigits | Digits and Letters combined |
Chars.Quotes | Single and Double quotes |
Chars.SingleQuote | Single Quotes |
Chars.DoubleQuote | Double Quotes |
Chars.Whitespace | Whitespace chars (tab, space, EOL, etc.) |
Chars.EOL | End of line chars (\r, \n) |
Example:
var pattern = new PatternBuilder<string>()
.SeekPast("Name:")
.Skip(Chars.Whitespace)
.Capture(Chars.LettersOrDigits)
.SeekPast(Chars.EOL)
.Build();
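As a rough illustration of what a character-set pattern like this one is meant to do, the sketch below reproduces the "skip whitespace, then capture letters or digits" steps with plain loops. It is not Patter's implementation: CharSetSketch, its local Whitespace and LettersOrDigits arrays (stand-ins for the Chars members) and the Names helper are all hypothetical, and ignoring empty captures is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class CharSetSketch
{
    // Local stand-ins for Chars.Whitespace and Chars.LettersOrDigits (assumed contents).
    static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };
    static readonly char[] LettersOrDigits =
        ("abcdefghijklmnopqrstuvwxyz" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "0123456789").ToCharArray();

    // Skip: advance the cursor while the current char is in the set.
    public static int Skip(string input, int pos, params char[] chars)
    {
        while (pos < input.Length && Array.IndexOf(chars, input[pos]) >= 0)
        {
            pos++;
        }
        return pos;
    }

    // Capture: collect chars while they are in the set; the cursor ends after them.
    public static (string captured, int pos) Capture(string input, int pos, params char[] chars)
    {
        int end = Skip(input, pos, chars);
        return (input.Substring(pos, end - pos), end);
    }

    // Repeats "seek past Name:, skip whitespace, capture letters or digits" over the input.
    public static List<string> Names(string input)
    {
        var names = new List<string>();
        int pos = 0;
        while (true)
        {
            int label = input.IndexOf("Name:", pos, StringComparison.Ordinal);
            if (label < 0) break;
            pos = Skip(input, label + "Name:".Length, Whitespace);
            var (name, next) = Capture(input, pos, LettersOrDigits);
            if (name.Length > 0) names.Add(name); // assumption: empty captures are ignored
            pos = next;
        }
        return names;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Names("Name: Alice42\nName:\tBob\nAge: 7")));
        // prints: Alice42,Bob
    }
}
```

Declaring the set as params char[] means a caller can pass either a prepared array or individual characters, for example Skip(input, 0, 'x', 'y', 'z').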
Technical notes
Patterns are 100% reusable and thread safe (meaning multiple threads can be evaluating a Patter pattern against strings safely).
Changes from 1.x
The major version was bumped to 2.x to follow semantic versioning rules: it has breaking changes which clean up the usage around character matching methods.
- Switched to PatternBuilder<T>().Build() ⇒ Pattern<T>, which makes it clearer when you are defining the pattern versus using the pattern. Only Pattern<T> has the Matches() method.
- Character-based methods were simplified to take char[] in their signature, so methods like SeekChars() were renamed to Seek(char[]).
- Where appropriate, char[] methods use the params keyword, so you can write .Skip('x','y','z')
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
.NETStandard 2.0
- No dependencies.