Repo: Databricks (DBRX), published Mar 25, 2023

databricks/docker-dev


Description: Arcion Demo Kit for testing database to database replication

Language: Shell

License: GPL-3.0

Stars: 3

Forks: 1

Open issues: 4

Created: 2023-03-25T14:08:54Z

Pushed: 2025-04-23T20:14:57Z

Default branch: main

Fork: no

Archived: no

README: More info at Arcion Demo Kit docs.

Overview

This is the Arcion Demo Kit. It is designed to demo and test Arcion replication from various data sources to targets. The diagram below describes the components of the demo kit. Please refer to https://docs.arcion.io for more info.

  • Load Generators
      • Yahoo Cloud Serving Benchmark (YCSB)
      • Carnegie Mellon Database Group BenchBase
  • Data sources
  • Arcion cluster with dedicated metadata database
  • Data destinations
graph LR
    L[Load Generator<br>TPC-C<br>YCSB] --> S
    subgraph Arcion Cluster
        A1
        M[(Meta<br>Data)]
    end
    S[(Source<br>Data)] --> A1[Arcion<br>UI]
    A1 --> T[(Destination<br>Data)]

Getting started

Assumptions:

  • Run on Windows WSL2, Linux or Mac
  • x86_64 (Intel, AMD) CPUs can run all databases
  • ARM64 (Apple Silicon, Tau, Graviton2) can run Oracle 19c
  • Access to a terminal
  • Access to a browser
  • Arcion license file replicant.lic
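
The CPU assumption in the list above can be checked before installing. The following is a minimal sketch, not part of the demo kit: the function name `classify_arch` is hypothetical, and it only maps the string printed by `uname -m` onto the two CPU families the list mentions.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the demo kit: map a machine string
# (as printed by `uname -m`) to the CPU family named in the assumptions.
classify_arch() {
  local machine="${1:-$(uname -m)}"
  case "$machine" in
    x86_64|amd64)  echo "x86_64" ;;
    arm64|aarch64) echo "arm64" ;;
    *)             echo "unknown" ;;
  esac
}

# Report the family of the current machine.
echo "CPU family: $(classify_arch)"
```

A result of `arm64` suggests checking the database list against the ARM64 note above before starting containers.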

Install and Setup

OSX (Mac) prerequisites

brew install dialog
brew install jq
brew install git
brew install wget
brew install bash
brew install podman
brew install podman-desktop
pip3 install podman-compose
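  • Homebrew bash is required for the demo kit's install.sh (macOS ships bash 3.2); register it as an allowed shell and make it the login shell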
echo $(brew --prefix)/bin/bash | sudo tee -a /private/etc/shells
chpass -s $(brew --prefix)/bin/bash
  • create a podman machine with 512GB of disk, 16GB of RAM, and 8 CPUs
podman machine init --disk-size 512 --memory 16384 --cpus 8
podman machine start
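
After the installs above, it can help to confirm that each tool is actually on the PATH. This is a small sketch under stated assumptions: `missing_tools` is a hypothetical helper, and the tool list simply mirrors the command-line packages installed above (podman-desktop is a GUI application, so it is left out).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools installed by the brew and pip3 commands above.
missing=$(missing_tools dialog jq git wget bash podman podman-compose)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All prerequisite tools found"
fi
```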

Install Arcion Demo Kit

Copy and paste the following into a terminal.

git clone https://github.com/databricks/docker-dev
cat >>~/.profile [...]
for db in $( find * -maxdepth 2 -name docker-compose.yaml ); do
pushd $(dirname $db) 2>/dev/null; docker compose stop; popd 2>/dev/null
done
  • down stops and removes the containers and their networks
for db in $( find * -maxdepth 2 -name docker-compose.yaml ); do
pushd $(dirname $db) 2>/dev/null; docker compose down; popd 2>/dev/null
done
  • up -d creates and starts the containers in the background
for db in $( find * -maxdepth 2 -name docker-compose.yaml ); do
pushd $(dirname $db); docker compose up -d; popd
done
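
The loops above differ only in the docker compose action. As a sketch, they could be folded into one function. The name `compose_all` and the `DRY_RUN` switch are hypothetical and not part of the demo kit. `find . -maxdepth 3` is used here as a rough equivalent of `find * -maxdepth 2`, with the difference that it also descends into hidden directories.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the loops above: run one docker compose
# action in every directory that holds a docker-compose.yaml.
# Set DRY_RUN=1 to print the commands instead of running them.
compose_all() {
  local action="$1" file dir
  find . -maxdepth 3 -name docker-compose.yaml | sort |
    while IFS= read -r file; do
      dir=$(dirname "$file")
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "(cd $dir && docker compose $action)"
      else
        # Subshell keeps the caller's working directory unchanged.
        # $action is unquoted on purpose so "up -d" splits into two words.
        ( cd "$dir" && docker compose $action )
      fi
    done
}

# Preview what a stop would do; nothing is executed in dry-run mode.
DRY_RUN=1 compose_all stop
```

Running `compose_all "up -d"` without `DRY_RUN` would then start every database found under the current directory.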

Cloud Database Examples

Snowflake

  • Snowflake source to MySQL destination

Use the default settings on the MySQL destination.

Single thread each for the extractor and applier; the source catalog is SNOWFLAKE_SAMPLE_DATA and the source schema is TPCH_SF1.

SRCDB_DB=SNOWFLAKE_SAMPLE_DATA SRCDB_SCHEMA=TPCH_SF1 arcdemo.sh snapshot snowflake mysql

Two threads each for the extractor and applier; the source catalog is the default arcsrc and the source schema is PUBLIC.

arcdemo.sh -b 2:2 snapshot snowflake mysql
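
The two commands differ in whether SRCDB_DB and SRCDB_SCHEMA are set for the single invocation. The sketch below illustrates that shell pattern only. `resolve_source` is a hypothetical function, the defaults arcsrc and PUBLIC are taken from the text above, and how arcdemo.sh actually reads these variables is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of per-command environment overrides that
# fall back to defaults. How arcdemo.sh reads them is an assumption.
resolve_source() {
  local catalog="${SRCDB_DB:-arcsrc}"
  local schema="${SRCDB_SCHEMA:-PUBLIC}"
  echo "${catalog}.${schema}"
}

# No overrides: the defaults apply when the variables are unset.
resolve_source
# Overrides set only for this one invocation.
SRCDB_DB=SNOWFLAKE_SAMPLE_DATA SRCDB_SCHEMA=TPCH_SF1 resolve_source
```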

Oracle Docker Setup

Oracle requires container images to be built locally. Start with Oracle XE, then move to Oracle EE for volume testing. Oracle XE does not require the extra step of downloading the Oracle EE binary. Use Oracle EE for any scale factor beyond 10.

Oracle XE

  • Build the image
cd oracle
git clone https://github.com/oracle/docker-images oracle-docker-images
pushd oracle-docker-images/OracleDatabase/SingleInstance/dockerfiles
./buildContainerImage.sh -v 21.3.0 -x -o '--build-arg SLIMMING=false'
popd
cd ..
  • Start service
docker compose -f oraxe2130/docker-compose.yaml up -d
  • A test example

Scale factor 10, snapshot inter-table parallelism of 2

arcdemo.sh -s 10 -b 2:2 full oraxe pg
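
To make the command line above easier to read, here is a sketch of how such arguments could be parsed with `getopts`. It is not the demo kit's own parser: `parse_demo_args` is hypothetical, the defaults of scale 1 and threads 1:1 are assumptions, and reading `-b` as extractor:applier is inferred from the Snowflake examples.

```shell
#!/usr/bin/env bash
# Hypothetical parser, not arcdemo.sh itself. Assumptions: -s is the
# scale factor, -b is extractor:applier threads, and the positional
# arguments are mode, source, and target.
parse_demo_args() {
  local scale=1 threads="1:1" opt
  OPTIND=1   # reset so the function can be called more than once
  while getopts "s:b:" opt; do
    case "$opt" in
      s) scale="$OPTARG" ;;
      b) threads="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  # ${threads%%:*} is the text before the colon, ${threads#*:} the text after.
  echo "scale=$scale extractor=${threads%%:*} applier=${threads#*:} mode=$1 source=$2 target=$3"
}

parse_demo_args -s 10 -b 2:2 full oraxe pg
# prints: scale=10 extractor=2 applier=2 mode=full source=oraxe target=pg
```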

Testing with different tags of demokit

  • to install a specific tag
export ARCION_WORKLOADS_TAG=23.07
/bin/bash -c "$(curl -k -fsSL https://raw.githubusercontent.com/databricks/docker-dev/${ARCION_WORKLOADS_TAG:-HEAD}/install.sh)"
  • to install DBs not listed in the menu
export ARCION_DOCKER_DBS=(db2 informix)
/bin/bash -c "$(curl -k -fsSL https://raw.githubusercontent.com/databricks/docker-dev/${ARCION_WORKLOADS_TAG:-HEAD}/install.sh)"
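
Both install commands rely on `${ARCION_WORKLOADS_TAG:-HEAD}`, which expands to HEAD when the variable is unset or empty. The sketch below only prints the resulting URL so the expansion can be checked before anything is downloaded; `install_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the install.sh URL for a tag, falling back
# to HEAD when ARCION_WORKLOADS_TAG is unset or empty.
install_url() {
  echo "https://raw.githubusercontent.com/databricks/docker-dev/${ARCION_WORKLOADS_TAG:-HEAD}/install.sh"
}

# Show which URL the install command would fetch.
install_url
```

Printing the URL first also makes it easy to download the script with curl and read it before running it.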

Generate Source / Target Matrix

cd bin
./startall.sh
./recdemo.sh