DotMP 1.5.0


DotMP



A library for writing OpenMP-style parallel code in .NET. It is inspired by the fork-join paradigm of OpenMP and attempts to replicate the OpenMP programming style as faithfully as possible, though it breaks from the spec at times.

Link to repository.

Installing DotMP via NuGet

The easiest way to install DotMP is from the NuGet package manager:

dotnet add package DotMP

Building DotMP from Source

First, clone DotMP and navigate to the source directory:

git clone https://github.com/computablee/DotMP.git
cd DotMP

DotMP can be built using the make command. To build the entire project, including all tests, examples, and documentation, run the following command:

make

This command will build the main library, all tests, examples, benchmarks, and the documentation into their respective directories, but will not run any tests.

To build only the main library, run the following command:

make build

To build only the tests, run the following command:

make tests

To run the tests, run the following command:

make test

To build only the examples, run the following command:

make examples

This will build all of the examples, including the natively parallelized C# examples, the DotMP-parallelized examples, and the sequential examples. You can also build each of these classes of examples individually by running one or more of the following commands:

make examples-cs
make examples-dmp
make examples-seq

To build only the benchmarks, run the following command:

make benches

Documentation

You can use Doxygen to build the documentation for this project. A Doxyfile is located in the root of the project directory. To build the documentation, run the following command:

make docs

This will generate documentation in the root of the project under the docs directory in both LaTeX and HTML formats. A copy of the most up-to-date documentation is hosted on GitHub.

Supported Constructs

Parallel

Given the OpenMP:

#pragma omp parallel
{
    work();
}

DotMP provides:

DotMP.Parallel.ParallelRegion(() => {
    work();
});

This function supports the num_threads optional parameter, which sets the number of threads to spawn. The default value is the number of logical threads on the system.

For

Given the OpenMP:

#pragma omp for
for (int i = a; i < b; i++)
{
    work(i);
}

DotMP provides:

DotMP.Parallel.For(a, b, i => {
    work(i);
});

This function supports the schedule optional parameter, which sets the parallel scheduler to use. Permissible values are DotMP.Schedule.Static, DotMP.Schedule.Dynamic, DotMP.Schedule.Guided, and DotMP.Schedule.Runtime. The default value is DotMP.Schedule.Static.

This function supports the chunk_size optional parameter, which sets the chunk size for the scheduler to use. The default value is dependent on the scheduler and is not documented, as it may change from version to version.

The behavior of DotMP.Parallel.For is undefined if not used within a ParallelRegion.
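The scheduling parameters above can be combined in one call. The sketch below assumes `schedule` and `chunk_size` are passed as named optional parameters after the loop body, matching the parameter names documented here; `work` is a placeholder:

```csharp
// Sketch: a dynamically scheduled loop with a chunk size of 16,
// inside the ParallelRegion that For requires.
// Assumes For's optional parameters follow the lambda argument.
DotMP.Parallel.ParallelRegion(() =>
{
    DotMP.Parallel.For(0, 1024, i =>
    {
        work(i); // placeholder for the loop body
    }, schedule: DotMP.Schedule.Dynamic, chunk_size: 16);
});
```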

Parallel For

Given the OpenMP:

#pragma omp parallel for
for (int i = a; i < b; i++)
{
    work(i);
}

DotMP provides:

DotMP.Parallel.ParallelFor(a, b, i => {
    work(i);
});

This function supports all of the optional parameters of ParallelRegion and For, and is merely a wrapper around those two functions for conciseness.

For with Reduction

Given the OpenMP:

type local = c;

#pragma omp for reduction(op:local)
for (int i = a; i < b; i++)
{
    local `op` f(i);
}

DotMP provides:

type local = c;

DotMP.Parallel.ForReduction(a, b, op, ref local, (ref type local, int i) => {
    local `op` f(i);
});

op is a value provided by the DotMP.Operations enum, which supports the values Add, Subtract, Multiply, BinaryAnd, BinaryOr, BinaryXor, BooleanAnd, BooleanOr, Min, and Max. The operation on local is an operator corresponding to the operator specified by DotMP.Operations, including +, -, *, &, |, ^, and so on.

This function supports all of the optional parameters of For.
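As a concrete instance of the pattern above, the sketch below sums `f(i)` over the iteration space using the `Add` operation; `f` is a placeholder function, and the `double` overload is assumed:

```csharp
// Sketch: sum reduction over [0, 1024) with DotMP.Operations.Add.
// Each thread accumulates into its own 'local'; DotMP combines the
// thread-local results into 'sum' when the loop completes.
double sum = 0.0;

DotMP.Parallel.ParallelRegion(() =>
{
    DotMP.Parallel.ForReduction(0, 1024, DotMP.Operations.Add, ref sum,
        (ref double local, int i) =>
    {
        local += f(i); // f is a placeholder for per-iteration work
    });
});
```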

Parallel For with Reduction

Given the OpenMP:

type local = c;

#pragma omp parallel for reduction(op:local)
for (int i = a; i < b; i++)
{
    local `op` f(i);
}

DotMP provides:

type local = c;

DotMP.Parallel.ParallelForReduction(a, b, op, ref local, (ref type local, int i) => {
    local `op` f(i);
});

This function supports all of the optional parameters of ParallelRegion and ForReduction, and is merely a wrapper around those two functions for conciseness.

For with Collapse

Given the OpenMP:

#pragma omp for collapse(n)
for (int i = a; i < b; i++)
    for (int j = c; j < d; j++)
        // ...
            for (int k = e; k < f; k++)
                work(i, j, /* ... */, k);

DotMP provides:

DotMP.Parallel.ForCollapse((a, b), (c, d), /* ... */, (e, f),
                           (i, j, /* ... */, k) => {
    work(i, j, /* ... */, k);
});

If four or fewer loops are being collapsed, overloads of ForCollapse exist to easily collapse said loops. If greater than four loops are being collapsed, then the user should pass an array of tuples as the first argument, and accept an array of indices in the lambda.

This function supports all of the optional parameters of For.

For with Reduction and Collapse

Given the OpenMP:

type local = c;

#pragma omp for reduction(op:local) collapse(n)
for (int i = a; i < b; i++)
    for (int j = c; j < d; j++)
        // ...
            for (int k = e; k < f; k++)
                local `op` f(i, j, /* ... */, k);

DotMP provides:

type local = c;

DotMP.Parallel.ForReductionCollapse((a, b), (c, d), /* ... */, (e, f), op, ref local,
                                    (ref type local, int i, int j, /* ... */, int k) => {
    local `op` f(i, j, /* ... */, k);
});

This function is a combination of ForCollapse and ForReduction, and supports all of the optional parameters thereof.

Parallel For with Collapse

Given the OpenMP:

#pragma omp parallel for collapse(n)
for (int i = a; i < b; i++)
    for (int j = c; j < d; j++)
        // ...
            for (int k = e; k < f; k++)
                work(i, j, /* ... */, k);

DotMP provides:

DotMP.Parallel.ParallelForCollapse((a, b), (c, d), /* ... */, (e, f),
                                   (i, j, /* ... */, k) => {
    work(i, j, /* ... */, k);
});

This function supports all of the optional parameters of ParallelRegion and ForCollapse, and is merely a wrapper around those two functions for conciseness.

Parallel For with Reduction and Collapse

Given the OpenMP:

type local = c;

#pragma omp parallel for reduction(op:local) collapse(n)
for (int i = a; i < b; i++)
    for (int j = c; j < d; j++)
        // ...
            for (int k = e; k < f; k++)
                local `op` f(i, j, /* ... */, k);

DotMP provides:

type local = c;

DotMP.Parallel.ParallelForReductionCollapse((a, b), (c, d), /* ... */, (e, f), op, ref local,
                                            (ref type local, int i, int j, /* ... */, int k) => {
    local `op` f(i, j, /* ... */, k);
});

This function supports all of the optional parameters of ParallelRegion and ForReductionCollapse, and is merely a wrapper around those two functions for conciseness.

Sections

Given the OpenMP:

#pragma omp sections
{
    #pragma omp section
    {
        work();
    }
    #pragma omp section
    {
        work2();
    }
}

DotMP provides:

DotMP.Parallel.Sections(() => {
    work();
}, () => {
    work2();
});

Parallel Sections

Given the OpenMP:

#pragma omp parallel sections
{
    #pragma omp section
    {
        work();
    }
    #pragma omp section
    {
        work2();
    }
}

DotMP provides:

DotMP.Parallel.ParallelSections(() => {
    work();
}, () => {
    work2();
});

This function supports the optional parameter num_threads from DotMP.Parallel.ParallelRegion.

Critical

Given the OpenMP:

#pragma omp critical
{
    work();
}

DotMP provides:

DotMP.Parallel.Critical(id, () => {
    work();
});

This function requires an id parameter, which is used as a unique identifier for a particular critical region. If multiple critical regions are present in the code, they should each have a unique id. The id should likely be a const int or an integer literal.

Barrier

Given the OpenMP:

#pragma omp barrier

DotMP provides:

DotMP.Parallel.Barrier();

Master

Given the OpenMP:

#pragma omp master
{
    work();
}

DotMP provides:

DotMP.Parallel.Master(() => {
    work();
});

Master's behavior is left undefined if used outside of a ParallelRegion.

Single

Given the OpenMP:

#pragma omp single
{
    work();
}

DotMP provides:

DotMP.Parallel.Single(id, () => {
    work();
});

The id parameter provided should follow the same guidelines as specified in Critical.

A Single region is only executed once per DotMP.Parallel.ParallelRegion, and is executed by the first thread that encounters it.

Single's behavior is left undefined if used outside of a ParallelRegion.

Ordered

Given the OpenMP:

#pragma omp ordered
{
    work();
}

DotMP provides:

DotMP.Parallel.Ordered(id, () => {
    work();
});

The id parameter provided should follow the same guidelines as specified in Critical.

Ordered's behavior is left undefined if used outside of a For.

Atomics

OpenMP atomics are implemented as follows:

#pragma omp atomic
a op b;

where op is some supported operator.

DotMP supports a subset of this for the int, uint, long, and ulong types. The only implemented atomic operations are a += b, a &= b, a |= b, ++a, and --a. a -= b is implemented, but for signed types only, due to restrictions in interfacing with C#'s Interlocked class.

The following table documents the supported atomics:

Operation DotMP function
a += b DotMP.Atomic.Add(ref a, b)
a -= b DotMP.Atomic.Sub(ref a, b)
a &= b DotMP.Atomic.And(ref a, b)
a |= b DotMP.Atomic.Or(ref a, b)
++a DotMP.Atomic.Inc(ref a)
--a DotMP.Atomic.Dec(ref a)

For atomic operations like compare-exchange, we recommend interfacing directly with System.Threading.Interlocked. For unsupported atomic operations or types, we recommend using DotMP.Parallel.Critical. This is a limitation of the underlying hardware more than of DotMP.
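The functions in the table above take the target by reference, so they can be used directly on a variable shared across threads. A minimal sketch, counting even numbers in parallel:

```csharp
// Sketch: 'total' is shared by all threads, so the increment must
// be atomic; DotMP.Atomic.Add follows the table above.
int total = 0;

DotMP.Parallel.ParallelFor(0, 1_000_000, i =>
{
    if (i % 2 == 0)
        DotMP.Atomic.Add(ref total, 1);
});
```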

Locks

DotMP supports OpenMP-style locks. It is recommended to use C#'s native lock keyword where possible for performance. However, this API is provided to those who want the familiarity of OpenMP locks.

DotMP supports the DotMP.Lock object, which is the replacement for omp_lock_t. omp_init_lock and omp_destroy_lock are not implemented. Instead, users should instantiate the DotMP.Lock object using the new keyword.

DotMP provides the following functions:

<omp.h> function DotMP function Comments
omp_set_lock(lock) lock.Set() Halt the current thread until the lock is obtained
omp_unset_lock(lock) lock.Unset() Free the current lock, making it available for other threads
omp_test_lock(lock) lock.Test() Attempt to obtain a lock without blocking, returns true if locking is successful
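Putting the table above together, a minimal sketch of guarding a shared collection with a DotMP.Lock (`Compute` is a placeholder):

```csharp
// Sketch: a DotMP.Lock is created with 'new' (there is no
// init/destroy pair), then Set/Unset bracket the critical work.
var resultsLock = new DotMP.Lock();
var results = new System.Collections.Generic.List<int>();

DotMP.Parallel.ParallelFor(0, 100, i =>
{
    int value = Compute(i); // placeholder for per-iteration work
    resultsLock.Set();      // block until the lock is acquired
    results.Add(value);     // only one thread mutates the list at a time
    resultsLock.Unset();    // release the lock for other threads
});
```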

Shared Memory

DotMP supports an API for declaring thread-shared memory within a parallel region. Shared memory is provided through the DotMP.Shared<T> class, which implements the IDisposable interface. DotMP.Shared<T> objects support implicit casting to type T, and can be used as such. They also allow an explicit DotMP.Shared<T>.Get() method to be used to retrieve the value of the shared variable. For setting, the DotMP.Shared<T>.Set(T value) method must be used.

For indexable types, such as arrays, the DotMP.SharedEnumerable<T> class is provided. This class implements the IDisposable interface, and supports implicit casting to the containing type. This class also overloads the [] operator to allow for indexing.

The following provides an example of a parallel vector initialization using DotMP.SharedEnumerable<T>:

static double[] InitVector()
{
    double[] returnVector = null;
    
    DotMP.Parallel.ParallelRegion(() =>
    {
        using (var vec = DotMP.SharedEnumerable.Create("vec", new double[1024]))
        {
            DotMP.Parallel.For(0, 1024, i =>
            {
                vec[i] = 1.0;
            });

            returnVector = vec;
        }
    });
    
    return returnVector;
}

The DotMP.Shared and DotMP.SharedEnumerable classes support the following methods:

Method Action
DotMP.Shared.Shared(string name, T value) Initializes a shared variable with name name and starting value value
DotMP.Shared.Dispose() Disposes of a shared variable
DotMP.Shared.Set(T value) Sets a shared variable to value value
DotMP.Shared.Get() Gets a shared variable
DotMP.SharedEnumerable.SharedEnumerable(string name, U value) Initializes a shared array with name name and starting value value
DotMP.SharedEnumerable.Dispose() Disposes of a shared array
DotMP.SharedEnumerable.Get() Gets a shared enumerable as its containing type.

The DotMP.Shared constructor and Dispose() methods serve as implicit barriers, ensuring that all threads can access the memory before proceeding.

DotMP.Shared provides a factory method for creating DotMP.Shared instances via the DotMP.Shared.Create() method. DotMP.SharedEnumerable provides factory methods for creating DotMP.SharedEnumerable instances containing either T[] or List<T> enumerables via the DotMP.SharedEnumerable.Create() methods.

Tasking System

DotMP supports a rudimentary tasking system. Submitting a task adds the task to a global task queue. When a tasking point is hit, threads will begin working on tasks in the task queue. There are two tasking points currently in DotMP:

  • At the end of a DotMP.Parallel.ParallelRegion, all remaining tasks in the task queue are completed
  • Upon encountering DotMP.Parallel.Taskwait, all current tasks in the task queue are completed

Tasks can be submitted throughout the execution of a parallel region, including from within other tasks, and support dependencies. Spawning tasks returns a DotMP.TaskUUID object which can be passed as a parameter to future tasks, marking those tasks as dependent on the originating task.

The following analogues to OpenMP functions are provided:

Task

Given the OpenMP:

#pragma omp task
{
    work();
}

DotMP provides:

DotMP.Parallel.Task(() => {
    work();
});

This function supports depends as a params parameter. depends accepts DotMP.TaskUUID objects, and marks the created task as dependent on the tasks passed through depends.

This function adds a task to the task queue and is deferred until a tasking point.

This function returns a DotMP.TaskUUID object, which can be passed to future depends clauses.
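The returned DotMP.TaskUUID objects can be chained into a dependency graph. The sketch below assumes dependencies are passed as trailing arguments after the lambda (per the params depends parameter described above); LoadData, TransformData, and StoreData are placeholders:

```csharp
// Sketch: a three-task dependency chain. Each task runs only after
// the tasks named in its depends arguments have completed.
DotMP.Parallel.ParallelRegion(() =>
{
    DotMP.Parallel.Master(() =>
    {
        DotMP.TaskUUID load = DotMP.Parallel.Task(() => LoadData());
        DotMP.TaskUUID transform = DotMP.Parallel.Task(() => TransformData(), load);
        DotMP.Parallel.Task(() => StoreData(), transform);
    });

    // Tasking point: queued tasks execute here, respecting dependencies.
    DotMP.Parallel.Taskwait();
});
```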

Taskwait

Given the OpenMP:

#pragma omp taskwait

DotMP provides:

DotMP.Parallel.Taskwait();

This function acts as a tasking point, as well as an implicit barrier.

Taskloop

Given the OpenMP:

#pragma omp taskloop
for (int i = a; i < b; i++)
{
    work(i);
}

DotMP provides:

DotMP.Parallel.Taskloop(a, b, i => {
    work(i);
});

This function supports the num_tasks optional parameter, which specifies the number of tasks into which the loop is broken.

This function supports the grainsize optional parameter, which specifies how many iterations belong to each individual task.

If both num_tasks and grainsize are provided, the num_tasks parameter takes precedence over the grainsize parameter.

This function supports the only_if optional parameter. only_if is an opportunity to provide a boolean expression to determine if the taskloop should generate tasks or execute sequentially. This is beneficial if the taskloop might be very small and wouldn't be worth the (albeit light) overhead of creating tasks and waiting on a tasking point.

This function supports depends as a params parameter. depends accepts DotMP.TaskUUID objects, and marks the created tasks as dependent on the tasks passed through depends.

This function adds a series of tasks to the task queue and is deferred until a tasking point.

This function returns a DotMP.TaskUUID[] array, where each element is a DotMP.TaskUUID representing one of the generated tasks. The DotMP.TaskUUID[] array can be passed to future depends clauses.

Parallel Master

Given the OpenMP:

#pragma omp parallel master
{
    work();
}

DotMP provides:

DotMP.Parallel.ParallelMaster(() => {
    work();
});

This function supports the num_threads optional parameter from DotMP.Parallel.ParallelRegion.

Master Taskloop

Given the OpenMP:

#pragma omp master taskloop
for (int i = a; i < b; i++)
{
    work(i);
}

DotMP provides:

DotMP.Parallel.MasterTaskloop(a, b, i => {
    work(i);
});

This function supports all of the optional parameters from DotMP.Parallel.Taskloop, except depends.

This function does not return a DotMP.TaskUUID[] array.

Parallel Master Taskloop

Given the OpenMP:

#pragma omp parallel master taskloop
for (int i = a; i < b; i++)
{
    work(i);
}

DotMP provides:

DotMP.Parallel.ParallelMasterTaskloop(a, b, i => {
    work(i);
});

This function supports all of the optional parameters from DotMP.Parallel.ParallelRegion and DotMP.Parallel.Taskloop, except depends.

This function does not return a DotMP.TaskUUID[] array.

Supported Functions

DotMP provides an analogue of the following functions:

<omp.h> function DotMP function Comments
omp_get_num_procs() DotMP.Parallel.GetNumProcs() Returns the number of logical threads on the system
omp_get_num_threads() DotMP.Parallel.GetNumThreads() Returns the number of active threads in the current region
omp_set_num_threads() DotMP.Parallel.SetNumThreads() Sets the number of threads for the next parallel region to use
omp_get_thread_num() DotMP.Parallel.GetThreadNum() Gets the ID of the current thread
omp_get_max_threads() DotMP.Parallel.GetMaxThreads() Gets the maximum number of threads the runtime may use in the next region
omp_in_parallel() DotMP.Parallel.InParallel() Returns true if called from within a parallel region
omp_set_dynamic() DotMP.Parallel.SetDynamic() Tells the runtime to dynamically adjust the number of threads; this can be disabled by calling SetNumThreads
omp_get_dynamic() DotMP.Parallel.GetDynamic() Returns true if the runtime can dynamically adjust the number of threads
omp_set_nested() DotMP.Parallel.SetNested() Throws a NotImplementedException
omp_get_nested() DotMP.Parallel.GetNested() Returns false
omp_get_wtime() DotMP.Parallel.GetWTime() Returns the number of seconds since the Unix Epoch as a double
omp_get_schedule() DotMP.Parallel.GetSchedule() Gets the current schedule of the parallel for loop
omp_get_schedule() DotMP.Parallel.GetChunkSize() Gets the current chunk size of the parallel for loop
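A minimal sketch combining several of the runtime functions above; the Critical region (with an id, as documented earlier) keeps the output from interleaving:

```csharp
// Sketch: query the runtime from inside a parallel region.
DotMP.Parallel.SetNumThreads(4);

DotMP.Parallel.ParallelRegion(() =>
{
    int tid = DotMP.Parallel.GetThreadNum();      // this thread's ID
    int nthreads = DotMP.Parallel.GetNumThreads(); // active thread count

    DotMP.Parallel.Critical(0, () =>
    {
        System.Console.WriteLine($"Thread {tid} of {nthreads}");
    });
});
```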
Compatible and additional computed target framework versions.

.NET: net6.0 and net7.0 are compatible; net8.0 and the platform-specific variants (android, ios, maccatalyst, macos, tvos, windows, browser) were computed.

Included target frameworks (in package):
  • net6.0 — no dependencies.
  • net7.0 — no dependencies.


Release Notes

Add collapsed worksharing for loops. Improve locking API. Bug fixes, documentation fixes, new logo, more rigorous testing, refactoring.