GCS (Google Cloud Storage) Destination Plugin

Latest: v2.0.0

This destination plugin lets you sync data from a CloudQuery source to Google Cloud Storage (GCS) in formats such as CSV, JSON and Parquet.

This is useful in many cases, especially in data lakes, where you can query the data directly from Athena or load it into data warehouses such as BigQuery, Redshift, Snowflake and others.

Example

This example configures a GCS destination to create CSV files in gcs://bucket_name/path/to/files. Note that the GCS plugin only supports the append write mode.

The (top level) spec section is described in the Destination Spec Reference.

kind: destination
spec:
  name: "gcs"
  path: "cloudquery/gcs"
  version: "v2.0.0"
  write_mode: "append" # gcs only supports 'append' mode
  # batch_size: 10000 # optional
  # batch_size_bytes: 5242880 # optional
  spec:
    bucket: "bucket_name"
    path: "path/to/files"
    format: "csv"
    format_spec:
      delimiter: ","

The GCS destination writes data in batches and supports the batch_size and batch_size_bytes options.
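
As a sketch of how the batching options might be set, the configuration below enables the two optional settings that are commented out in the example above. The values are the ones shown in that example; whether they match the plugin defaults is not stated here, and reading them as a row limit and a byte limit per batch is an assumption based on the option names.

```yaml
kind: destination
spec:
  name: "gcs"
  path: "cloudquery/gcs"
  version: "v2.0.0"
  write_mode: "append" # gcs only supports 'append' mode
  batch_size: 10000 # optional; assumed to cap the number of rows per batch
  batch_size_bytes: 5242880 # optional; assumed to cap batch size in bytes (5 * 1024 * 1024)
  spec:
    bucket: "bucket_name"
    path: "path/to/files"
    format: "csv"
```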

GCS Spec

This is the (nested) spec used by the GCS destination plugin.

bucket (string) (required)

Bucket to sync the files to.

path (string) (required)

Path within the above bucket where the files will be uploaded.

format (string) (required)

Format of the output file. Supported values are csv, json and parquet.

format_spec (map format_spec) (optional)

Optional parameters that change the format of the file.
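
For illustration, a nested spec that uses only the three required fields might look like the sketch below. It assumes the same placeholder bucket and path as the example above and picks parquet from the supported formats; format_spec is omitted because it is optional.

```yaml
kind: destination
spec:
  name: "gcs"
  path: "cloudquery/gcs"
  version: "v2.0.0"
  write_mode: "append"
  spec:
    bucket: "bucket_name" # placeholder bucket name
    path: "path/to/files" # placeholder path inside the bucket
    format: "parquet" # one of: csv, json, parquet
```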

format_spec

delimiter (string) (optional) (default: ,)

Character used as the delimiter when the format is csv.

skip_header (bool) (optional) (default: false)

Specifies whether to omit the header row from the first line of the file (when format is csv). By default, the header row is written.
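
A possible format_spec for CSV output is sketched below. The semicolon delimiter is only an illustrative choice, and the comment on skip_header assumes that setting it to true omits the header row.

```yaml
kind: destination
spec:
  name: "gcs"
  path: "cloudquery/gcs"
  version: "v2.0.0"
  write_mode: "append"
  spec:
    bucket: "bucket_name"
    path: "path/to/files"
    format: "csv"
    format_spec:
      delimiter: ";" # illustrative; the default is ","
      skip_header: true # assumed to omit the header row; the default is false
```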
