Faborite.Core 0.1.1
Install the Faborite.Core package with your preferred tooling:
- .NET CLI: dotnet add package Faborite.Core --version 0.1.1
- Package Manager Console: NuGet\Install-Package Faborite.Core -Version 0.1.1
- PackageReference: <PackageReference Include="Faborite.Core" Version="0.1.1" />
- Central package management: <PackageVersion Include="Faborite.Core" Version="0.1.1" /> together with <PackageReference Include="Faborite.Core" />
- Paket: paket add Faborite.Core --version 0.1.1
- F# scripting: #r "nuget: Faborite.Core, 0.1.1"
- File-based apps: #:package Faborite.Core@0.1.1
- Cake addin: #addin nuget:?package=Faborite.Core&version=0.1.1
- Cake tool: #tool nuget:?package=Faborite.Core&version=0.1.1
Faborite
Sync Microsoft Fabric lakehouse data locally for faster development.
Faborite lets you pull sample data from your Fabric Lakehouses to your local machine, so you can develop and test notebooks/scripts without waiting for cloud compute.
Why Faborite?
When working with Microsoft Fabric, you often need to:
- Wait for cloud compute to spin up just to test a simple query
- Pay for compute time during development iterations
- Context-switch between local and cloud environments
Faborite solves this by bringing a representative sample of your data locally, enabling:
- Instant iteration - No cold start, no waiting
- Cost savings - Develop locally, deploy to cloud
- Better testing - Test with real data patterns locally
Features
- Smart Sampling - Random, recent, stratified, or custom SQL sampling
- Multiple Formats - Export to Parquet, CSV, JSON, or DuckDB
- Fast - Parallel downloads, DuckDB-powered sampling
- Configurable - Sensible defaults, fully customizable per-table
- Secure - Uses Azure authentication (CLI, Service Principal, Managed Identity)
- Single Executable - Built with .NET 10 for fast startup and easy deployment
- Production Ready - Comprehensive validation, logging, and retry policies
Installation
Download Binary
Download the latest release from GitHub Releases:
| Platform | Download |
|---|---|
| Windows (x64) | faborite-win-x64.zip |
| Linux (x64) | faborite-linux-x64.tar.gz |
| macOS (x64) | faborite-osx-x64.tar.gz |
| macOS (ARM64) | faborite-osx-arm64.tar.gz |
As a .NET Global Tool
dotnet tool install -g faborite
From Source
git clone https://github.com/mjtpena/faborite.git
cd faborite
dotnet build
Quick Start
1. Login to Azure
az login
2. Initialize Configuration
faborite init
Edit faborite.json with your workspace and lakehouse IDs.
3. Sync Data
# Sync all tables with defaults (10,000 random rows each)
faborite sync --workspace <workspace-id> --lakehouse <lakehouse-id>
# Or use the config file
faborite sync
4. Use Your Data
import duckdb
df = duckdb.read_parquet('./local_lakehouse/customers/customers.parquet').df()
CLI Reference
sync
Sync data from OneLake to your local machine.
faborite sync [options]
| Option | Short | Description | Default |
|---|---|---|---|
| --workspace | -w | Workspace ID (GUID) | From config |
| --lakehouse | -l | Lakehouse ID (GUID) | From config |
| --config | -c | Config file path | faborite.json |
| --rows | -n | Number of rows to sample | 10000 |
| --strategy | -s | Sampling strategy | random |
| --format | -f | Output format | parquet |
| --output | -o | Output directory | ./local_lakehouse |
| --table | -t | Tables to sync (repeatable) | All tables |
| --skip | | Tables to skip (repeatable) | None |
| --parallel | -p | Max parallel downloads | 4 |
| --no-schema | | Skip schema export | false |
Examples:
# Sync specific tables
faborite sync -w <id> -l <id> --table customers --table orders
# Custom sampling
faborite sync -w <id> -l <id> --rows 5000 --strategy recent
# Export as DuckDB database
faborite sync -w <id> -l <id> --format duckdb
# Export as CSV
faborite sync -w <id> -l <id> --format csv
list-tables (alias: ls)
List available tables in a lakehouse.
faborite list-tables -w <workspace-id> -l <lakehouse-id>
init
Generate a sample configuration file.
faborite init [options]
| Option | Short | Description | Default |
|---|---|---|---|
| --output | -o | Output file path | faborite.json |
| --force | -f | Overwrite existing file | false |
status
Show status of locally synced data.
faborite status [options]
| Option | Short | Description | Default |
|---|---|---|---|
| --path | -p | Local data directory | ./local_lakehouse |
Sampling Strategies
| Strategy | Description | Use Case |
|---|---|---|
| random | Random sample using DuckDB's USING SAMPLE | General development |
| recent | Most recent rows by date column | Time-series data |
| head | First N rows | Quick testing |
| tail | Last N rows | Recent additions |
| stratified | Proportional sample by column | Categorical data |
| query | Custom SQL query | Complex filters |
| full | All rows (no sampling) | Small lookup tables |
Configuration
Config File
Create a faborite.json file in your project root:
{
"workspaceId": "your-workspace-guid",
"lakehouseId": "your-lakehouse-guid",
"sample": {
"rows": 10000,
"strategy": "random"
},
"format": {
"output": "parquet",
"compression": "snappy"
},
"sync": {
"localPath": "./local_lakehouse",
"parallelTables": 4,
"includeSchema": true
},
"auth": {
"method": "cli"
},
"tableOverrides": {
"large_table": {
"rows": 1000,
"strategy": "recent",
"dateColumn": "created_at"
},
"lookup_table": {
"strategy": "full"
}
}
}
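The tableOverrides block layers per-table settings on top of the global sample defaults. The sketch below shows one way to read those semantics from Python; it assumes overrides simply take precedence key-by-key over the defaults, which is an interpretation of the config above rather than Faborite's actual resolution code.
import json

# Illustrative only: compute the effective settings for a table by layering its
# tableOverrides entry over the global "sample" defaults (assumed merge semantics).
with open("faborite.json") as f:
    config = json.load(f)

def effective_settings(table_name):
    settings = dict(config.get("sample", {}))                        # global defaults
    settings.update(config.get("tableOverrides", {}).get(table_name, {}))  # per-table overrides
    return settings

print(effective_settings("large_table"))  # {'rows': 1000, 'strategy': 'recent', 'dateColumn': 'created_at'}
print(effective_settings("customers"))    # falls back to the defaults: {'rows': 10000, 'strategy': 'random'}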
Environment Variables
All configuration can be overridden with environment variables:
| Variable | Description |
|---|---|
| FABORITE_WORKSPACE_ID | Workspace ID |
| FABORITE_LAKEHOUSE_ID | Lakehouse ID |
| FABORITE_OUTPUT_PATH | Output directory |
| FABORITE_SAMPLE_ROWS | Default sample rows |
| FABORITE_FORMAT | Output format |
| AZURE_TENANT_ID | Azure tenant for service principal auth |
| AZURE_CLIENT_ID | Azure client ID for service principal auth |
| AZURE_CLIENT_SECRET | Azure client secret for service principal auth |
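A minimal sketch of that precedence in Python, assuming environment variables win over values in faborite.json as described above (the helper below is illustrative and not part of Faborite):
import os, json

# Illustrative only: an env var, when set, overrides the corresponding config value.
with open("faborite.json") as f:
    file_config = json.load(f)

workspace_id = os.environ.get("FABORITE_WORKSPACE_ID", file_config.get("workspaceId"))
lakehouse_id = os.environ.get("FABORITE_LAKEHOUSE_ID", file_config.get("lakehouseId"))
print(workspace_id, lakehouse_id)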
Output Structure
./local_lakehouse/
├── customers/
│   ├── customers.parquet
│   └── _schema.json
├── orders/
│   ├── orders.parquet
│   └── _schema.json
├── products/
│   ├── products.parquet
│   └── _schema.json
└── lakehouse.duckdb        # When using --format duckdb
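A quick way to sanity-check what was synced is to walk this directory from Python. The sketch below assumes the layout shown above and treats _schema.json as plain JSON; adjust table names to match your lakehouse.
from pathlib import Path
import json
import pandas as pd

# Load every synced table into a dict of DataFrames (one subfolder per table).
root = Path("./local_lakehouse")
tables = {}
for table_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    parquet_files = list(table_dir.glob("*.parquet"))
    if not parquet_files:
        continue
    tables[table_dir.name] = pd.read_parquet(parquet_files[0])
    schema_file = table_dir / "_schema.json"
    if schema_file.exists():
        print(table_dir.name, json.loads(schema_file.read_text()))

print({name: len(df) for name, df in tables.items()})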
Authentication
Faborite uses Azure Identity for authentication:
| Method | Description | Config |
|---|---|---|
| Azure CLI (default) | Uses az login credentials | "method": "cli" |
| Service Principal | App registration with secret | "method": "serviceprincipal" |
| Managed Identity | For Azure-hosted environments | "method": "managedidentity" |
| Default | Azure DefaultAzureCredential chain | "method": "default" |
Service Principal Setup
# Set environment variables
export AZURE_TENANT_ID="your-tenant-id"
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"
# Update config
{
"auth": {
"method": "serviceprincipal",
"tenantId": "your-tenant-id",
"clientId": "your-client-id"
}
}
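If you want to confirm that your service principal (or az login session) can actually reach the lakehouse before running a sync, a quick check from Python is sketched below. It assumes the azure-identity and azure-storage-file-datalake packages, the standard OneLake ADLS Gen2 endpoint, and a GUID-based OneLake path layout (workspace ID as the filesystem, lakehouse ID as the top-level folder); Faborite itself authenticates through the Azure SDK for .NET, so this is only a credential sanity check.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# DefaultAzureCredential picks up AZURE_TENANT_ID / AZURE_CLIENT_ID / AZURE_CLIENT_SECRET
# when set, and otherwise falls back to az login and other credential sources.
credential = DefaultAzureCredential()
service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=credential,
)

# Placeholders: replace with your workspace and lakehouse GUIDs.
fs = service.get_file_system_client("<workspace-id>")
for path in fs.get_paths(path="<lakehouse-id>/Tables", recursive=False):
    print(path.name)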
Using with Notebooks
After syncing, load data in your local notebooks:
Python with DuckDB
import duckdb
# If exported as DuckDB
conn = duckdb.connect('./local_lakehouse/lakehouse.duckdb')
df = conn.execute("SELECT * FROM customers").df()
# If exported as Parquet
df = duckdb.read_parquet('./local_lakehouse/customers/customers.parquet').df()
Python with Pandas
import pandas as pd
df = pd.read_parquet('./local_lakehouse/customers/customers.parquet')
Python with Polars
import polars as pl
df = pl.read_parquet('./local_lakehouse/customers/customers.parquet')
.NET
using DuckDB.NET.Data;
using var connection = new DuckDBConnection("Data Source=./local_lakehouse/lakehouse.duckdb");
connection.Open();
// Query your data
using var command = connection.CreateCommand();
command.CommandText = "SELECT COUNT(*) FROM customers";
var rowCount = command.ExecuteScalar();
Requirements
- Runtime: .NET 10.0 (or use self-contained builds)
- Azure: Account with access to Microsoft Fabric
- Permissions: Read access to the target Lakehouse via OneLake
Finding Your IDs
- Workspace ID: Go to your Fabric workspace → Settings → Copy the Workspace ID
- Lakehouse ID: Open your Lakehouse → Settings → Copy the Lakehouse ID
Development
Prerequisites
- .NET 10.0 SDK
- Azure CLI (for authentication)
Building
# Clone the repository
git clone https://github.com/mjtpena/faborite.git
cd faborite
# Build
dotnet build
# Run tests
dotnet test
# Run with coverage
dotnet test --collect:"XPlat Code Coverage"
# Run the CLI
dotnet run --project src/Faborite.Cli -- sync -w <workspace-id> -l <lakehouse-id>
Publishing
# Publish self-contained for Windows
dotnet publish src/Faborite.Cli -c Release -r win-x64 --self-contained -o publish/win-x64
# Publish self-contained for Linux
dotnet publish src/Faborite.Cli -c Release -r linux-x64 --self-contained -o publish/linux-x64
# Publish self-contained for macOS
dotnet publish src/Faborite.Cli -c Release -r osx-x64 --self-contained -o publish/osx-x64
dotnet publish src/Faborite.Cli -c Release -r osx-arm64 --self-contained -o publish/osx-arm64
Project Structure
faborite/
├── src/
│   ├── Faborite.Core/             # Core library
│   │   ├── Configuration/         # Config loading & validation
│   │   ├── OneLake/               # OneLake ADLS Gen2 client
│   │   ├── Sampling/              # DuckDB-powered sampling
│   │   ├── Export/                # Format exporters
│   │   ├── Logging/               # Logging infrastructure
│   │   ├── Resilience/            # Retry policies (Polly)
│   │   └── FaboriteService.cs     # Main orchestrator
│   └── Faborite.Cli/              # CLI application
│       ├── Commands/              # CLI commands
│       └── Program.cs             # Entry point
├── tests/
│   ├── Faborite.Core.Tests/       # Core library tests
│   └── Faborite.Cli.Tests/        # CLI tests
├── .github/
│   └── workflows/
│       └── ci.yml                 # CI/CD pipeline
└── Faborite.sln
Roadmap
- Delta Lake time travel support
- Incremental sync (only changed data)
- Schema drift detection
- VS Code extension
- GitHub Action for CI/CD pipelines
- Support for Fabric Warehouses
Contributing
Contributions are welcome! Please read our Contributing Guide for details on our code of conduct and the process for submitting pull requests.
Quick Contribution Steps
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Security
Please see our Security Policy for reporting vulnerabilities.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- DuckDB - For blazing fast local analytics
- Azure SDK for .NET - For Azure integration
- Spectre.Console - For beautiful CLI output
- Polly - For resilience policies
Made with ❤️ by Michael John Peña
| Product | Compatible and additional computed target frameworks |
|---|---|
| .NET | net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed. |
Dependencies (net10.0)
- Azure.Identity (>= 1.13.1)
- Azure.Storage.Files.DataLake (>= 12.21.0)
- DuckDB.NET.Data (>= 1.2.0)
- DuckDB.NET.Data.Full (>= 1.2.0)
- Microsoft.Extensions.Configuration (>= 10.0.0)
- Microsoft.Extensions.Configuration.EnvironmentVariables (>= 10.0.0)
- Microsoft.Extensions.Configuration.Json (>= 10.0.0)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.0)
- Microsoft.Extensions.Options.ConfigurationExtensions (>= 10.0.0)
- Parquet.Net (>= 5.0.2)
- Polly (>= 8.4.2)