
Connector Google Cloud Storage Service Account

The Google Cloud Storage connector lets you connect a Google Cloud Storage bucket to Meiro Integrations so that it can be used as a data source in a data workflow.

This component connects through a service account.

Learn more: about Google Cloud Storage service accounts here.

Requirements

To set up the configuration for the Google Cloud Storage connector, you will need a Google Cloud Storage account. You can create your account here.

Learn more: If you have not used Google Cloud Storage before, we recommend checking the official Google Cloud Storage documentation.

Features

Wildcard

Using * (wildcard) at the end of an expression in a key field allows you to search for the necessary files in the bucket. For example, use MyFolder/* to connect to all the files in the directory MyFolder.
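The wildcard rule can be illustrated with a minimal Python sketch. This is not the connector's actual implementation: the function name `matches_key` and the `include_subfolders` flag are hypothetical, and treating a key without a wildcard as an exact match, as well as skipping nested folders by default, are assumptions made for illustration.

```python
def matches_key(search_key: str, object_name: str, include_subfolders: bool = False) -> bool:
    """Return True if object_name is selected by search_key (hypothetical helper)."""
    if not search_key.endswith("*"):
        # Assumption: without a trailing wildcard the key must match exactly.
        return object_name == search_key
    prefix = search_key[:-1]  # drop the trailing "*"
    if not object_name.startswith(prefix):
        return False
    remainder = object_name[len(prefix):]
    # Assumption: files in nested folders are selected only when
    # include_subfolders is True.
    return include_subfolders or "/" not in remainder


objects = ["MyFolder/a.csv", "MyFolder/sub/b.csv", "Other/c.csv"]
top_level = [name for name in objects if matches_key("MyFolder/*", name)]
with_subfolders = [
    name for name in objects
    if matches_key("MyFolder/*", name, include_subfolders=True)
]
print(top_level)
print(with_subfolders)
```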

Data In/Data Out

Data In: N/A

Data Out: Your results will be saved in the Data Out bucket data/out/files.

Remember: You can set the prefix for loading data to the bucket through the path to the input files. Construct the path as /data/in/files/PREFIX/outputfile, where `PREFIX` is the prefix for the bucket (it can contain subfolders). As a result, files will be loaded to `BUCKET/PREFIX` on Google Cloud Storage.
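The path rule in the note above can be sketched as follows. The helper name `destination_for` and the example bucket and file names are hypothetical, and the sketch assumes the file name is kept unchanged under `BUCKET/PREFIX`.

```python
from pathlib import PurePosixPath

# Root folder for input files, as given in the note above.
INPUT_ROOT = PurePosixPath("/data/in/files")


def destination_for(local_path: str, bucket: str) -> str:
    """Map /data/in/files/PREFIX/outputfile to BUCKET/PREFIX/outputfile."""
    # relative_to raises ValueError if the path is not under INPUT_ROOT.
    relative = PurePosixPath(local_path).relative_to(INPUT_ROOT)
    return f"{bucket}/{relative}"


# Hypothetical example: PREFIX is "exports/2024", which contains a subfolder.
print(destination_for("/data/in/files/exports/2024/report.csv", "my-bucket"))
```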

Learn more: about the folder structure here.

Learn how: to move files from one folder in the configuration to another using the Command Line Interface Code processor, please refer to this article.

Parameters

Google-Cloud-Storage-Parameters.png

Bucket Name (required)

Provide a Google Cloud Storage bucket name. All bucket names reside in a single Cloud Storage namespace, so every bucket name is globally unique. To find the bucket name, open your Google Cloud Storage account, go to the Storage section, and open the Browser.

Learn more: about Google Cloud Storage buckets here.

Search Key (required)

To search by key prefix for files in the Google Cloud Storage bucket, you can use a * wildcard at the end of the key. For example, to connect only the files from the folder myfolder, enter myfolder/*.

 

Include Sub-folders (optional)

Mark this field to download the data from the bucket together with all subfolders. This option is available only with the wildcard *.

New Files Only (optional)

Checking the New Files Only option loads the data incrementally.

After the first run of the configuration in incremental mode, the state.json file in the Data Out bucket shows properties such as:

  • "last_downloaded_file_timestamp", the timestamp of the last change in the Google Cloud Storage connector (seconds since Jan 01 1970 UTC)
  • "files_modified_in_last_downloaded_file_timestamp", the list of names of the files processed at that timestamp

Each time you run the configuration, Meiro Integrations checks the values of these properties and downloads only the unprocessed files to the Data Out bucket.
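One way such an incremental check could work is sketched below. Only the two property names come from state.json as described above; the function names, the `(name, modified_timestamp)` tuples, and the comparison rule are assumptions, not the connector's actual code.

```python
TS_KEY = "last_downloaded_file_timestamp"
NAMES_KEY = "files_modified_in_last_downloaded_file_timestamp"


def select_unprocessed(files, state):
    """Keep files newer than the stored timestamp, plus files at that
    exact timestamp that have not been processed yet (assumed rule)."""
    last_ts = state.get(TS_KEY, 0)
    seen = set(state.get(NAMES_KEY, []))
    return [
        (name, ts) for name, ts in files
        if ts > last_ts or (ts == last_ts and name not in seen)
    ]


def update_state(state, downloaded):
    """Return the state to store after a run (hypothetical helper)."""
    if not downloaded:
        return dict(state)
    newest = max(ts for _, ts in downloaded)
    names = {name for name, ts in downloaded if ts == newest}
    if newest == state.get(TS_KEY):
        # Same timestamp as the previous run: extend the list of names.
        names |= set(state.get(NAMES_KEY, []))
    return {TS_KEY: newest, NAMES_KEY: sorted(names)}


# Hypothetical bucket listing: (object name, modification time in epoch seconds).
files = [("a.csv", 100), ("b.csv", 200), ("c.csv", 200), ("d.csv", 300)]
state = {TS_KEY: 200, NAMES_KEY: ["b.csv"]}

to_download = select_unprocessed(files, state)
new_state = update_state(state, to_download)
print(to_download)
print(new_state)
```

Storing the names alongside the timestamp lets a run distinguish files that share the last timestamp but were not yet processed.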

Limit (required)

The Limit field sets the maximum number of files to download. If the key matches more files than the limit, only the oldest files will be downloaded.
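The effect of the limit can be sketched in a few lines of Python. The helper name `apply_limit` is hypothetical, and the sketch assumes that "oldest" is judged by the file's modification time.

```python
def apply_limit(files, limit):
    """Return at most `limit` files, oldest first.

    `files` is a list of (name, modified_timestamp) tuples; using the
    modification time to decide which files are oldest is an assumption.
    """
    return sorted(files, key=lambda item: item[1])[:limit]


# Hypothetical matches for a search key, with epoch-second timestamps.
matched = [("c.csv", 300), ("a.csv", 100), ("b.csv", 200)]
print(apply_limit(matched, 2))
```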

 

SERVICE_ACCOUNT_KEY_FILE

In the Google Cloud Storage Console, create a service account with these parameters. Simply copy and paste the parameters from the Google Cloud Storage Console.

Warning: Private keys contain line breaks, which appear as \n in the key. Copy the whole private key, including every \n, and insert it directly into the JSON (not into the form).

Authorisation with service account

{
  "service_account_key_file": {
    "type": "service_account",
    "project_id": "project-id",
    "private_key_id": "key-id",
    "private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
    "client_email": "service-account-email",
    "client_id": "client-id",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://accounts.google.com/o/oauth2/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
  }
}
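To show how the line breaks in a private key end up as \n inside the JSON, here is a minimal Python sketch. It wraps a key dictionary under "service_account_key_file" as in the example above; the helper name `build_config` and the placeholder key values are hypothetical, and in practice the dictionary would be read from the key file downloaded from the console.

```python
import json


def build_config(key: dict) -> str:
    """Wrap a service account key in the structure shown above."""
    return json.dumps({"service_account_key_file": key}, indent=2)


# Placeholder values only; a real key would come from the downloaded key file.
key = {
    "type": "service_account",
    "project_id": "project-id",
    "private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
    "client_email": "service-account-email",
}

config_text = build_config(key)
print(config_text)

# In the serialized text each line break of the key is the two characters
# backslash + n, and parsing the JSON restores the original line breaks.
restored = json.loads(config_text)["service_account_key_file"]["private_key"]
```

Serializing with a JSON library rather than editing by hand keeps the escaped line breaks intact.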

service-account-key-file.png

client-id.png