Connector AWS S3
Amazon Simple Storage Service (S3) is a scalable and highly durable cloud storage service provided by Amazon Web Services (AWS). It enables users to store and retrieve any amount of data at any time, making it a reliable solution for data storage in the cloud.
AWS S3 integrates with a wide range of data processing and analytics tools, enabling organizations to efficiently transfer, transform, and load data into their CDP. This makes it a pivotal component of a streamlined and scalable data pipeline, facilitating the flow of information between storage and analytics platforms within the AWS ecosystem.
Setting up the connector in MI
The AWS S3 connector allows you to connect an AWS S3 bucket with Meiro Integrations and use it as a data source for the data workflow.
Requirements
To configure the AWS S3 connector, you need an AWS account with an S3 bucket in it. You can create your account here. If you have not used AWS before, we recommend reading these articles:
- How to create and activate AWS account
- How to create an S3 Bucket
- How do I create an AWS Access Key?
- Where is My Secret Access Key?
- Best Practices for Managing AWS Access Keys
Features
Incremental mode
The extractor keeps track of the downloaded files in a state file and downloads only the files it has not processed yet.
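The idea behind incremental mode can be illustrated with a minimal sketch. The state file layout used here (a JSON object with a `processed` list) and the function names are hypothetical and chosen only for illustration; the connector's actual state format is not described in this article.

```python
import json
from pathlib import Path


def load_processed(state_path: Path) -> set:
    """Return the keys recorded in the state file, or an empty set if it does not exist yet."""
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text())["processed"])


def select_unprocessed(keys, processed):
    """Keep only the keys that were not downloaded in an earlier run."""
    return [key for key in keys if key not in processed]


def save_processed(state_path: Path, processed) -> None:
    """Persist the processed keys so that the next run can skip them."""
    state_path.write_text(json.dumps({"processed": sorted(processed)}))
```

On each run, such a process would load the state, download only the keys returned by `select_unprocessed`, and then save the updated set.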
Wildcard
Using * (wildcard) at the end of the expression in the Key field lets you match multiple files in the bucket. For example, use MyFolder/* to connect to all the files in the directory MyFolder.
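A trailing wildcard of this kind behaves like a prefix match on the object key. The sketch below shows that interpretation only; the function name `matches_key` is hypothetical, and subfolder handling is deliberately left out.

```python
def matches_key(pattern: str, key: str) -> bool:
    """Match an object key against a pattern that may end with a * wildcard."""
    if pattern.endswith("*"):
        # Everything before the trailing * is treated as a key prefix.
        return key.startswith(pattern[:-1])
    # Without a wildcard, only the exact key matches.
    return key == pattern


keys = ["MyFolder/a.csv", "MyFolder/b.csv", "Other/c.csv"]
selected = [key for key in keys if matches_key("MyFolder/*", key)]
```

Here `selected` contains only the two keys that start with `MyFolder/`.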
Data In/Data Out
Data In |
N/A |
Data Out |
Archived files (GZip), located in the |
Learn more: for details about the folder structure, see this article.
Parameters
Access Key ID (required) |
The AWS Access Key ID, which looks like AKIA****. Create it in the credentials section of your AWS account: My_AWS -> My Security Credentials -> Access keys (access key ID and secret access key) -> Create New Access Key -> Download Key File. More details on how to create your AWS S3 Access Key can be found here. |
Secret Access Key (required) |
The AWS Secret Access Key is provided by AWS when you create a new AWS Access Key: My_AWS -> My Security Credentials -> Access keys (access key ID and secret access key) -> Create New Access Key -> Download Key File. More details on how to create your AWS S3 Secret Access Key can be found here. |
Bucket (required) |
The name of the AWS S3 bucket, which is a globally unique identifier. The region is detected automatically. |
Key (required) |
The key prefix used to search for files in the AWS S3 bucket. It can optionally end with a * wildcard. For example, to connect only the files from a particular folder “myfolder” in the bucket, enter myfolder/* . |
Save As (optional) |
Provide the name of the folder inside the /data/out/files directory where you’d like to store your downloaded files. If not indicated, files will be saved in /data/out/files . |
Include Sub-Folders (true/false) |
Downloads files from the bucket including all subfolders. Available only with the * wildcard. |
New Files Only (true/false) |
Turns on an incremental mode of loading the data. After the first configuration run in the incremental mode, the file
Each time you run the configuration, Meiro Integrations will check the values of these properties and download only the unprocessed files to the output bucket. |
Limit (required) |
The maximum number of files to download. If the key matches more files than this limit, the oldest files will be downloaded. |
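How Include Sub-Folders and Limit might interact with the key prefix can be sketched as follows. This is an illustration under stated assumptions, not the connector's implementation: the `S3Object` type and both function names are hypothetical, "oldest" is assumed to mean the earliest last-modified timestamp, and a file is assumed to be in a subfolder when the part of its key after the prefix contains a `/`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class S3Object:
    """Hypothetical stand-in for one listed object in the bucket."""
    key: str
    last_modified: datetime


def in_scope(prefix: str, key: str, include_subfolders: bool) -> bool:
    """Check whether a key falls under the prefix, optionally including subfolders."""
    if not key.startswith(prefix):
        return False
    remainder = key[len(prefix):]
    # Assumption: a "/" in the remainder means the file sits in a subfolder.
    return include_subfolders or "/" not in remainder


def apply_limit(objects, limit: int):
    """Return at most `limit` objects, oldest first (assumed: by last-modified time)."""
    return sorted(objects, key=lambda obj: obj.last_modified)[:limit]


objects = [
    S3Object("myfolder/new.csv", datetime(2024, 3, 1, tzinfo=timezone.utc)),
    S3Object("myfolder/old.csv", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    S3Object("myfolder/sub/mid.csv", datetime(2024, 2, 1, tzinfo=timezone.utc)),
]
candidates = [obj for obj in objects if in_scope("myfolder/", obj.key, include_subfolders=False)]
to_download = apply_limit(candidates, limit=1)
```

With sub-folders excluded and a limit of 1, only `myfolder/old.csv` remains in `to_download`.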