Contoso is a synthetic dataset containing sample sales transaction data for the fictional “Contoso” company. It includes various supporting tables for business intelligence, such as customer, store, product, and currency exchange data.

This dataset is perfect for practicing time series analysis, joins, financial modeling, or any business intelligence-related tasks.

It ships with a built-in dataset and can also create an in-memory database with DuckDB.

The package comes with the following tables:

Built into the package is the 10K row version of the dataset.

With view(), you can see each column's label, which is attached using the labelled package.

The inspiration for using labelled comes from Crystal Lewis's excellent blog post.

For larger datasets, use create_contoso_duckdb() with one of the following sizes:

Size     Approx. sales rows
small    ~8,000
medium   ~2.3 million
large    ~47 million
mega     ~237 million

Source

The data is originally sourced from the SQLBI GitHub site.

Dataset overview

The relationship keys that join the tables are listed below. Each column is a table, and a dash means the table does not carry that key.

sales          customer      product      store      order         orderrows    fx
order_key      -             -            -          order_key     order_key    -
customer_key   customer_key  -            -          customer_key  -            -
store_key      -             -            store_key  store_key     -            -
product_key    -             product_key  -          -             product_key  -
currency_code  -             -            -          -             -            from_currency
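
As a rough illustration of how these keys are meant to be used, the sketch below joins toy stand-ins for the sales, customer, and fx tables in base R. Only the key column names (customer_key, currency_code, from_currency) come from the list above. Every other column (amount, customer_name, rate_to_usd) and all values are invented for the example, and the real fx table is likely to have more columns than this simplified version.

```r
# Toy stand-ins for three Contoso tables. Only the key columns are taken from
# the relationship keys listed above; all other columns and values are made up.
sales <- data.frame(
  order_key     = c(1001L, 1002L, 1003L),
  customer_key  = c(1L, 2L, 1L),
  currency_code = c("USD", "EUR", "USD"),
  amount        = c(120, 80, 45),        # hypothetical column
  stringsAsFactors = FALSE
)

customer <- data.frame(
  customer_key  = c(1L, 2L),
  customer_name = c("Ada", "Grace"),     # hypothetical column
  stringsAsFactors = FALSE
)

fx <- data.frame(
  from_currency = c("USD", "EUR"),
  rate_to_usd   = c(1, 1.1),             # hypothetical column
  stringsAsFactors = FALSE
)

# Same key name on both sides: left join sales to customer on customer_key.
sales_customer <- merge(sales, customer, by = "customer_key", all.x = TRUE)

# Different key names: sales$currency_code matches fx$from_currency.
sales_fx <- merge(
  sales_customer, fx,
  by.x = "currency_code", by.y = "from_currency",
  all.x = TRUE
)

# Convert each sale using the exchange rate that was looked up by the join.
sales_fx$amount_usd <- sales_fx$amount * sales_fx$rate_to_usd

print(sales_fx[order(sales_fx$order_key), ])
```

The same pattern should carry over to the other keys: store_key and product_key share a name on both sides, so a plain by = "store_key" or by = "product_key" would be enough, while the currency join needs by.x and by.y because the column names differ.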

Installation

You can install the package from CRAN or the development version from GitHub:

install.packages("contoso")

Example

library(contoso)

# Create a DuckDB connection to Contoso datasets
db <- create_contoso_duckdb(size = "medium")

# Access the sales dataset
db$sales |> head()

# Launch the DuckDB UI to explore all tables interactively
launch_ui(db$con)

# Clean up when done
DBI::dbDisconnect(db$con, shutdown = TRUE)