Multiple chained queries in SQL on CSV processor
While working on data flow and using the SQL on CSV processor, you may encounter the problem of writing a series of queries where the result of the first query becomes the input for the following one. Depending on the complexity of the queries and the size of the table, nested queries are sometimes not the best option in this case.
One of the solutions for this problem is utilising a
WITH the clause which allows you to name the temporary table and to reference it in your query multiple times using its name. This way you can simplify the code, make it more readable and avoid multiple subqueries which makes the query less expensive in terms of performance. Here is a simple example of using a
WITH survived as ( SELECT * FROM TITANIC t WHERE t.Survived = 1 ) SELECT COUNT(*) FROM survived
After running the configuration you should see the result file
survived.csv in the Data Out bucket with the header and 1 row
The drawback of this solution is that you can chain only 2 queries and the result of the query inside the
WITH clause won’t be saved. Below we describe another solution that is not limited to 2 queries.
It is possible to use an output of one query as an input for the next one by applying the logic of the folder structure of Meiro Integration.
The folder structure of all components is similar and generally looks like this:
data/ in/ /tables /files out/ /tables /files
The result of each query in SQL on CSV processor is written by Meiro Integrations to the folder
data/out/tables. The input tables are read from the
data/in/tables by default, but it is possible to access the
data/out/tables by going two levels up to to the parent directory (
Let’s try to re-write the query from our earlier example into two separate queries.
SELECT * FROM TITANIC t WHERE t.Survived = 1
Input File (1)
Output File (1)
SELECT COUNT(*) FROM survived
Input File (2)
Note that we are accessing the Query (1) result in the Input File field.
Output File (2)
After running the configuration, 2 tables will be saved to
survived.csvthe result of a Query (1)
count_result.csvthe result of Query (2) where the input data was the table
Using this technique, you can combine as many queries as you need and, if necessary, save the results of the queries.
Learn more: about SQL on CSV processor, please refer to this article.