Posted on: April 10, 2021 at 01:00 AM

Datamon


  1. version info
> d2 version
Version: 2.3.0
BuildDate: 2020-11-10T11:29:17Z
Commit: 17ce7d6
Working tree: clean
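
For scripts that need to check which release is installed, the `Version:` line can be pulled out of this output. This is a minimal sketch that assumes the output layout shown above stays stable and is written to stdout; `datamon_version` is a hypothetical helper name, not part of datamon.

```shell
# Hypothetical helper: read `d2 version` output on stdin and print the version number.
# Assumes a line of the form "Version: X.Y.Z", as in the output above.
datamon_version() {
  awk '$1 == "Version:" { print $2; exit }'
}

# Usage (requires d2 on PATH):
#   d2 version | datamon_version
```
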
> datamon2
Datamon helps build ML pipelines by adding versioning, auditing and lineage tracking to cloud storage tools
(e.g. Google GCS, AWS S3).

This is not a replacement for these tools, but rather a way to manage their inputs and outputs.

Datamon works by providing a git like interface to manage data efficiently:
your data buckets are organized in repositories of versioned and tagged bundles of files.

Usage:
  datamon [command]

Available Commands:
  bundle      Commands to manage bundles for a repo
  config      Commands to manage the config file
  context     Commands to manage contexts.
  diamond     Commands to manage diamonds
  help        Help about any command
  label       Commands to manage labels for a repo
  repo        Commands to manage repos
  upgrade     Upgrades datamon to the latest release
  usage       Generates documentation
  version     prints the version of datamon
  web         Webserver

Flags:
      --config string             Set the config backend store to use (bucket name: do not set the scheme, e.g. 'gs://')
      --context string            Set the context for datamon (default "dev")
      --force                     Forces upgrade even if the current version is not a released version
  -h, --help                      help for datamon
      --loglevel string           The logging level. Levels by increasing order of verbosity: none, error, warn, info, debug (default "info")
      --metrics                   Toggle telemetry and metrics collection
      --metrics-password string   Password to connect to the metrics collector backend. Overrides any password set in URL
      --metrics-url string        Fully qualified URL to an influxdb metrics collector, with optional user and password
      --metrics-user string       User to connect to the metrics collector backend. Overrides any user set in URL
      --upgrade                   Upgrades the current version then carries on with the specified command

Use "datamon [command] --help" for more information about a command.
  1. config
d2 config set --config global-onec-co-datamon-config --context dev
d2 config set --config global-onec-co-datamon-config --context staging
d2 config set --config workshop-config
  1. context
d2 context
d2 context list
d2 context get
d2 context get dev
d2 context get staging
d2 context get --context dev
d2 context create test
~ d2 context list
[dev prod staging]
~ d2 context get --context dev
Model Version: 0, Name: dev, WAL: dev-onec-co-datamon-metadata-wal, ReadLog: dev-onec-co-datamon-readlog, Blob: global-onec-co-datamon-blob, Metadata: dev-onec-co-datamon-metadata, Version Metadata: dev-onec-co-datamon-vmetadata
~ d2 context get --context staging
Model Version: 0, Name: staging, WAL: staging-onec-co-datamon-metadata-wal, ReadLog: staging-onec-co-datamon-readlog, Blob: global-onec-co-datamon-blob, Metadata: staging-onec-co-datamon-metadata, Version Metadata: staging-onec-co-datamon-vmetadata
~ d2 context get --context prod
Model Version: 0, Name: prod, WAL: prod-onec-co-datamon-metadata-wal, ReadLog: prod-onec-co-datamon-readlog, Blob: global-onec-co-datamon-blob, Metadata: prod-onec-co-datamon-metadata, Version Metadata: prod-onec-co-datamon-vmetadata
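
Each context maps to its own set of buckets, so it can be handy to extract a single bucket name from the `context get` line. The sketch below assumes the single-line `Key: value, Key: value` layout shown above and that the line is printed on stdout; `context_field` is a hypothetical helper, not a datamon command.

```shell
# Hypothetical helper: print one field from `d2 context get` output read on stdin.
# Splits the line on commas, trims leading spaces, then matches the key exactly,
# so "Metadata" does not also match "Version Metadata".
context_field() {
  tr ',' '\n' | sed 's/^ *//' | awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Usage (requires d2 on PATH):
#   d2 context get --context dev | context_field Blob
```
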
  1. repo
d2 repo get zenrin-estat-residential
d2 repo get --repo zenrin-estat-residential
d2 repo list | grep zenrin-estat-residential

d2 repo create --repo ntd-road-source-dev --description "raw download of ntd road data"
d2 repo list | grep ntd

d2 repo create --context staging --repo ntd-road-source-staging --description "the original ntd road data"
d2 repo list --context staging | grep ntd

d2 repo
d2 repo get --repo resilience-japan-hazard-maps
d2 repo get --repo Seattle-Sample-Corelogic-Run-Data
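
The `ntd-road-source-dev` / `ntd-road-source-staging` pair above follows a `<name>-<context>` naming pattern. A small dry-run helper can generate the matching `repo create` commands for several contexts. This is only a sketch: `repo_create_cmds` is a hypothetical name, it prints the commands rather than running them, and it assumes the description contains no double quotes.

```shell
# Hypothetical helper: print (not run) one `d2 repo create` command per context,
# naming each repo <name>-<context> as in the examples above.
repo_create_cmds() {
  name=$1
  description=$2
  shift 2
  for ctx in "$@"; do
    printf 'd2 repo create --context %s --repo %s-%s --description "%s"\n' \
      "$ctx" "$name" "$ctx" "$description"
  done
}

repo_create_cmds ntd-road-source "raw download of ntd road data" dev staging
```

After reviewing the printed commands, they can be piped to `sh` to execute them.
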
  1. bundle
d2 bundle upload --path folder-to-upload --repo mkang-test-repo --message "my first upload"
d2 bundle list --repo mkang-test-repo
d2 bundle list files --repo mkang-test-repo --bundle 1fySBuavEhqWAXnYnZEiDCNm8TC
d2 bundle list files --repo mkang-test-repo --bundle 1fySBuavEhq 2>/dev/null | grep file
d2 bundle mount --repo mkang-test-repo --mount ~/mnt --daemonize
d2 bundle mount --repo mkang-test-repo --bundle 1fySBuavEhqWAXnYnZEiDCNm8TC --mount ~/mnt --daemonize
d2 bundle download --repo mkang-test-repo --destination .
d2 bundle list --repo ntd-road-source-staging --context staging
d2 bundle list --repo ntd-road-source-dev --context dev
d2 bundle upload --path ~/occ/prod/01-built-object-service/tmp/ntd/RI --repo ntd-road-source-dev
d2 bundle upload --path ~/occ/prod/01-built-object-service/tmp/ntd/RI --repo ntd-road-source-dev --message "upload RI"
d2 bundle list
d2 bundle list --repo ntd-road-source-dev
d2 bundle
d2 bundle mount --repo ntd-road-source-dev
d2 bundle list --repo zenrin-estat-residential
d2 bundle list --repo resilience-japan-hazard-maps
d2 bundle list --repo Seattle-Sample-Corelogic-Run-Data
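
The `2>/dev/null | grep file` pattern used above discards whatever the command writes to stderr, so only the matching stdout lines remain on screen. The leading slash matters: `2>dev/null` would write to a relative file named `dev/null` instead of the null device. The sketch below reproduces the effect with a stand-in function so that it runs without datamon installed; `fake_bundle_list` and its output are made up for illustration, and the real `d2 bundle list files` output may look different.

```shell
# Hypothetical stand-in for a d2 command: results on stdout, log noise on stderr.
fake_bundle_list() {
  echo "name: file-a.csv"
  echo "log: connecting" >&2
  echo "name: file-b.csv"
}

# stderr is sent to the null device, so the log line does not clutter the terminal.
# Only stdout travels through the pipe, and grep keeps the lines containing "file".
fake_bundle_list 2>/dev/null | grep file
```
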
  1. label
d2 label set --repo mkang-test-repo --bundle 1fySBuavEhqWAXnYnZEiDCNm8TC --label my-first-upload
d2 label set --repo mkang-test-repo --bundle 1fy --label the-second-label
d2 label set --repo mkang-test-repo --bundle 1fySBu --label the-second-label
d2 label set --repo mkang-test-repo --bundle 1fySBuavEhqWAXnYnZEiDCNm8TC --label the-second-label
d2 label list --repo mkang-test-repo
d2 label list --repo mkang-test-repo --prefix my
d2 label list --repo mkang-test-repo --prefix the
d2 label list --repo zenrin-estat-residential
  1. webserver
d2 web
http://0.0.0.0:65242