Folder Structure
Each configuration in Meiro Integrations is an independently running Docker container with a predefined folder structure. All files with data and files created in the data flow are saved in specific folders.
The structure of all directories in your configuration:
/data
    config.json
    /in
        /tables
        /files
    /out
        /tables
        /files
Depending on the type of component that you are using, the structure can be expanded, as explained below.
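To try out a script locally, outside Meiro Integrations, the base layout can be reproduced on disk. The following is a minimal sketch under that assumption; `create_layout` and `LAYOUT` are illustrative names, not part of Meiro Integrations, and the sketch does not create `config.json`.

```python
from pathlib import Path

# Sub-directories of the base layout, relative to the root directory.
LAYOUT = ("in/tables", "in/files", "out/tables", "out/files")


def create_layout(root):
    """Create the in/out bucket folders under `root` and return their paths."""
    root = Path(root)
    created = []
    for rel in LAYOUT:
        path = root / rel
        # parents=True also creates `root`, `in` and `out` when missing.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Example with a hypothetical local path (not the container's /data):
# create_layout("./sandbox/data")
```

Passing the root as an argument keeps the sketch usable both in a local sandbox and against a real `/data` directory.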
Data
`/data` is the root directory of your configuration; all working files and folders are saved here.
The root directory of Meiro Integrations contains:
- A JSON file, `config.json`, which contains all the parameters of your configuration. Refer to this article to learn more.
- An `/in` folder (an input bucket for the configuration).
- An `/out` folder (an output bucket for the configuration).
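Inside a processor script, the configuration parameters can be read from `config.json` with the standard `json` module. This is a minimal sketch: `load_config` is a hypothetical helper, and the keys inside the file are not described in this article, so the sketch only returns the parsed content without assuming any of them.

```python
import json
from pathlib import Path


def load_config(root="/data"):
    """Return the parsed content of <root>/config.json."""
    config_path = Path(root) / "config.json"
    with config_path.open(encoding="utf-8") as fh:
        return json.load(fh)


# Example (inside the container):
# config = load_config()
```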
Optional files and folders
- `.py` and `.r` script files in Python processor or R processor configurations, respectively. Meiro Integrations keeps the code from the script field of the configuration in the file `script.py` or `script.r`.
For Python and R processors, the folder structure looks like this:
/data
    config.json
    script.py (or script.r)
    /in
        /tables
        /files
    /out
        /tables
        /files
- A `/repository` folder, where Meiro Integrations clones the Git repository if you are using the Python Git processor or R Git processor components.
For Python from GIT and R from GIT processors, the folder structure looks like this:
/data
    config.json
    /in
        /tables
        /files
    /out
        /tables
        /files
    /repository
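Which of the optional items are present can be checked from a script. The sketch below is illustrative only: `describe_optional_items` is a hypothetical helper, and it assumes the optional items sit directly under the root directory, as the trees in this section suggest.

```python
from pathlib import Path


def describe_optional_items(root="/data"):
    """Report which optional files and folders exist under `root`."""
    root = Path(root)
    return {
        "script.py": (root / "script.py").is_file(),
        "script.r": (root / "script.r").is_file(),
        "repository": (root / "repository").is_dir(),
    }


# Example (inside the container):
# print(describe_optional_items())
```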
Data/In
The `/data/in` directory, with its subfolders, is an input bucket containing input files for the configuration.
When using a processor or a loader, all output files from the previous configuration connected to your processor or loader in the data flow are automatically saved in the `/data/in` bucket and the corresponding subfolders.
For connectors, this folder will remain empty.
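A processor script often starts by checking what arrived in the input bucket. The following is a minimal sketch; `list_inputs` is a hypothetical helper name, and it simply walks `in/tables` and `in/files`, including any subfolders.

```python
from pathlib import Path


def list_inputs(root="/data"):
    """Return the files found in the input bucket, grouped by subfolder."""
    in_dir = Path(root) / "in"
    found = {}
    for bucket in ("tables", "files"):
        base = in_dir / bucket
        if base.is_dir():
            # rglob("*") also descends into subfolders of the bucket.
            found[bucket] = sorted(
                p.relative_to(base).as_posix()
                for p in base.rglob("*")
                if p.is_file()
            )
        else:
            # A missing folder is reported as empty.
            found[bucket] = []
    return found


# Example (inside the container):
# print(list_inputs())
```

For a connector, both lists would be expected to be empty, since its input bucket is not filled.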
Data/Out
The `/data/out` directory, with its subfolders, is an output bucket for output files of the configuration.
All files from the `/data/out` folder and its subfolders will be saved in the input bucket (and its corresponding subfolders) of the next component of your data flow.
Data extracted in the connector components will be saved in the `/data/out` folder once the configuration is run.
Data transformed inside the processor components with the use of scripts will be saved in the `/data/out/files` or `/data/out/tables` folders.
For loader components, the data will be exported to a third-party application or a database, and therefore the `/data/out` folder will remain empty.
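In a processor script, handing a result to the next component amounts to writing a file into the output bucket. The following is a minimal sketch using Python's `csv` module; `write_table` is a hypothetical helper, and the `.csv` extension and UTF-8 encoding are assumptions rather than documented requirements.

```python
import csv
from pathlib import Path


def write_table(name, header, rows, root="/data"):
    """Write rows as CSV to <root>/out/tables/<name>.csv and return the path."""
    tables_dir = Path(root) / "out" / "tables"
    tables_dir.mkdir(parents=True, exist_ok=True)
    path = tables_dir / f"{name}.csv"
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
    return path


# Example (inside the container):
# write_table("customers", ["id", "name"], [[1, "Ann"], [2, "Bob"]])
```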
Files and Tables
Both Data In and Data Out buckets consist of two subfolders:
- `tables` is used for CSV tables,
- `files` is used for other types of files.
However, file storage may vary depending on the connectors and loaders that you choose. For example, the AWS S3 connector component saves all files, regardless of their type, into the `/data/out/files` folder, and the AWS S3 loader requires all files and tables to be in the `/data/in/files` folder.
Depending on the component, it is possible to save the data in subfolders of the `files` and `tables` folders, such as `/data/out/files/newfolder`, `/data/out/tables/newfolder/subfolder`, etc.
Names of the tables and files should contain only alphanumeric characters (`a-z`, `A-Z`, `0-9`) and `_`, `-` symbols.
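The naming rule can be checked before a file is written. The following is a minimal sketch with a regular expression; `is_valid_name` is a hypothetical helper, and it assumes the rule applies to the name without its file extension, which this article does not state explicitly.

```python
import re

# Letters a-z and A-Z, digits 0-9, underscore and hyphen; at least one character.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_name(name):
    """Return True if `name` uses only the allowed characters."""
    return _NAME_RE.fullmatch(name) is not None


# Example:
# is_valid_name("orders_2024-01")  # allowed characters only
# is_valid_name("my table")        # contains a space
```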
Please note that you can preview tables in a table or CSV format.
The process of creating subfolders varies from component to component. In order to learn more, please refer to the particular article from the Meiro Integrations user documentation.
Read more about connectors, processors, and loaders in the articles available online.