Streaming Data from HP Vertica into Google Cloud Storage

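In this short article we will see how to do a streaming transfer of data from an HP Vertica database (the same approach should work for any database whose SQL client can write query results to standard output) straight into a Google Cloud Storage bucket using the gsutil tool. The trick is to pass '-' to gsutil cp in place of src_url (source) or dst_url (destination): as the source it makes gsutil read from standard input, and as the destination it makes gsutil write to standard output. Example: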

runuser -l dbadmin -c 'vsql -U dbadmin -w secret_password -F "," -At -c "SELECT * FROM schema.table"' | gsutil cp - gs://bktv/schema.table.csv
Copying from <STDIN...
Uploading   gs://bktv/schema.table.csv:                         38.8 MiB/38.8 MiB
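  •  This creates a comma-separated file called schema.table.csv in the Google Cloud Storage bucket bktv (gs://bktv). The vsql options -A (unaligned output), -t (rows only, no header or footer) and -F (field separator) are what turn the query result into plain CSV rows.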
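The same '-' works in the other direction: used as the destination, it makes gsutil stream the object to standard output. Here is a minimal sketch for taking a quick look at the file we just uploaded. The function name preview_object and its default of 5 lines are only illustrative, not part of gsutil.

```shell
#!/usr/bin/env bash
# Sketch: stream an object from a bucket to stdout and show its first lines.
# preview_object is a hypothetical helper name; gsutil must be installed and authenticated.
preview_object() {
    local url="$1"          # e.g. gs://bktv/schema.table.csv
    local lines="${2:-5}"   # number of lines to show (defaults to 5)

    # '-' as the destination tells gsutil cp to write the object to standard output.
    gsutil cp "$url" - | head -n "$lines"
}
```

For example, preview_object gs://bktv/schema.table.csv 10 should print the first ten rows of the export without saving the file locally.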
There are some drawbacks to this type of data transfer:
  • Streaming uploads cannot be resumed, which means that if the upload gets stuck in the middle you will have to restart it from the beginning (so upload from local files when dealing with large amounts of data).
  • No checksum is computed on the streamed data, so Google won't do any validation of the data.
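For large tables, the local-file route mentioned in the first point can be sketched roughly as below: export to a temporary file first, then upload that file as a regular (non-streaming) copy. The function name export_then_upload and the temporary-file handling are assumptions made for illustration; the credentials are the placeholders from the example above, and the sketch assumes it runs as a user who can call vsql directly (otherwise wrap the vsql call in runuser as before).

```shell
#!/usr/bin/env bash
# Sketch: dump a query to a local file, then upload the file with a regular gsutil cp.
# export_then_upload is a hypothetical helper; user and password are placeholders.
export_then_upload() {
    local query="$1"   # e.g. 'SELECT * FROM schema.table'
    local dest="$2"    # e.g. gs://bktv/schema.table.csv
    local tmpfile rc

    tmpfile="$(mktemp)" || return 1

    # -A: unaligned output, -t: rows only, -F: field separator (comma).
    if ! vsql -U dbadmin -w secret_password -F ',' -At -c "$query" > "$tmpfile"; then
        rm -f "$tmpfile"
        return 1
    fi

    # The source is a real file rather than '-', so this is not a streaming transfer.
    gsutil cp "$tmpfile" "$dest"
    rc=$?

    # Remove the temporary export whether or not the upload succeeded.
    rm -f "$tmpfile"
    return "$rc"
}
```

Usage would look like export_then_upload 'SELECT * FROM schema.table' gs://bktv/schema.table.csv. The price is disk space for the temporary file, so this only pays off when the export is big enough that restarting a failed streaming upload would hurt.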