
Pushdown optimization using a Snowflake Cloud Data Warehouse V2 connection

You can configure pushdown optimization for a mapping that contains a Snowflake Cloud Data Warehouse V2 connection. Pushdown optimization enhances mapping performance.
When you run a task configured for pushdown optimization, the task converts the transformation logic to Snowflake queries, sends the queries to Snowflake, and processes the mapping logic in the Snowflake database.
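As a hedged illustration of this conversion, a simple Source > Filter > Target mapping might be pushed down as a single SQL statement in which the filter condition becomes a WHERE clause. The table and column names below are placeholders, not taken from this guide, and the exact SQL the task generates may differ:

```sql
-- Hypothetical pushed-down query for a Source -> Filter -> Target mapping.
-- CUSTOMERS, CUSTOMERS_CA, and STATE are placeholder names.
INSERT INTO CUSTOMERS_CA (ID, NAME, STATE)
SELECT ID, NAME, STATE
FROM CUSTOMERS
WHERE STATE = 'CA';
```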

Read from Amazon S3 and write to Snowflake

You can configure pushdown optimization for a mapping that uses an Amazon S3 V2 connection in the Source transformation to read from Amazon S3 and a Snowflake Cloud Data Warehouse V2 connection in the Target transformation to write to Snowflake.

Example

You work for a healthcare organization. Your organization offers a suite of services to manage electronic medical records, patient engagement, telephonic health services, and care coordination services. The organization uses infrastructure based on Amazon Web Services and stores its data on Amazon S3. The management now plans to load data to a data warehouse to perform healthcare analytics and create data points to improve operational efficiency. To load data from an Amazon S3-based storage object to Snowflake, you must use ETL and ELT with the required transformations that support the data warehouse model.
Use an Amazon S3 V2 connection to read data from a file object in an Amazon S3 source and a Snowflake Cloud Data Warehouse V2 connection to write to a Snowflake target. Configure full pushdown optimization in the mapping to optimize the performance.
The Amazon S3 source data is uploaded to the Snowflake stage using the PUT command. The task then uses Snowflake COPY commands to load the data to Snowflake, converting the transformations to the corresponding SQL functions and expressions. Pushdown optimization enhances task performance and reduces the processing cost.
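Under the hood, the staging flow resembles the following hedged sketch. The stage, file, and table names are placeholders, and the exact commands the task generates may differ:

```sql
-- Hypothetical staging flow; @pdo_stage, records.csv, and MEDICAL_RECORDS
-- are placeholder names.
PUT file:///tmp/records.csv @pdo_stage;

COPY INTO MEDICAL_RECORDS
  FROM @pdo_stage/records.csv.gz
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
```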

Read from and write to Snowflake

You can configure pushdown optimization in a mapping to read from and write to Snowflake using a Snowflake Cloud Data Warehouse V2 connection.

Example

You work for a car retail company with more than 50,000 dealerships and 5,000 inspection centers globally. The company stores millions of records in Snowflake hosted on AWS. You want to use Data Integration to perform transformations on the data before writing it back to Snowflake.
Use a Snowflake Cloud Data Warehouse V2 connection in the mapping to read from the Snowflake source and write the processed data to the Snowflake target. Configure full pushdown optimization in the mapping to enhance the performance.

Read from Google Cloud Storage and write to Snowflake

You can configure pushdown optimization for a mapping that uses a Google Cloud Storage connection in the Source transformation to read from Google Cloud Storage and a Snowflake Cloud Data Warehouse V2 connection in the Target transformation to write to Snowflake.

Example

You work for a healthcare solutions organization that provides healthcare technology to pharmacies and pharmacy chains. You enable pharmacies to process prescriptions, store and provide access to healthcare records, and improve patient outcomes. Your organization stores its data in Google Cloud Storage.
The management wants to create a patient-centric pharmacy management system. The organization plans to leverage the warehouse infrastructure of Snowflake and load all its data to Snowflake so that they can make operational, financial, and clinical decisions with ease.
To load data from a Google Cloud Storage object to Snowflake, you must use ETL and ELT with the required transformations that support the data warehouse model.
Use a Google Cloud Storage V2 connection to read data from a Google Cloud Storage bucket and a Snowflake Cloud Data Warehouse V2 connection to write to a Snowflake target. Configure full pushdown optimization in the mapping to optimize the performance.
The Google Cloud Storage source data is uploaded to the Snowflake stage using the PUT command. The task then uses Snowflake COPY commands to load the data to Snowflake, converting the transformations to the corresponding SQL functions and expressions. Pushdown optimization enhances task performance and reduces the processing cost.

Read from Microsoft Azure Data Lake Storage Gen2 and write to Snowflake

You can configure pushdown optimization for a mapping that uses a Microsoft Azure Data Lake Storage Gen2 connection in the Source transformation to read from Microsoft Azure Data Lake Storage Gen2 and a Snowflake Cloud Data Warehouse V2 connection in the Target transformation to write to Snowflake.

Example

You want to load data from a Microsoft Azure Data Lake Storage Gen2-based storage object to Snowflake for analytical purposes. You want to transform the data before it is made available to users. Use a Microsoft Azure Data Lake Storage Gen2 connection to read data from a Microsoft Azure Data Lake Storage Gen2 source and a Snowflake Cloud Data Warehouse V2 connection to write to a Snowflake target. Configure full pushdown optimization in the mapping task to optimize the performance of loading data to Snowflake.
The Microsoft Azure Data Lake Storage Gen2 source data is uploaded to the Snowflake stage using the PUT command. The task then uses Snowflake COPY commands to load the data to Snowflake, converting the transformations to the corresponding SQL functions and expressions. Pushdown optimization enhances task performance and reduces the processing cost.

Pushdown compatibility

You can configure the task to push transformations, variables, functions, and operators to the database.
When you use pushdown optimization, the Secure Agent converts the expression in the transformation by determining equivalent operators, variables, and functions in the database. If there is no equivalent operator, variable, or function, the Secure Agent processes the transformation logic.

Supported functions for Snowflake Cloud Data Warehouse V2 mappings

When you use pushdown optimization, the Secure Agent converts the expression in the transformation by determining equivalent functions in the database. If there is no equivalent function, the Secure Agent processes the transformation logic.
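For instance, an Informatica expression such as IIF(ISNULL(PRICE), 0, PRICE) has a direct Snowflake equivalent, so it can be pushed down. The following is a hedged sketch of the kind of translation involved; the exact SQL the Secure Agent generates may differ, and the table and column names are placeholders:

```sql
-- Snowflake's IFF() and IS NULL mirror the Informatica IIF() and ISNULL()
-- functions; PRICE and PRODUCTS are placeholder names.
SELECT IFF(PRICE IS NULL, 0, PRICE) AS PRICE
FROM PRODUCTS;
```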
The following table summarizes the availability of pushdown functions in a Snowflake database. An X in the Pushdown column indicates that the function can be pushed to Snowflake by using full pushdown optimization. A dash (-) indicates that the function cannot be pushed to the database.
| Function        | Pushdown | Function        | Pushdown | Function        | Pushdown |
|-----------------|----------|-----------------|----------|-----------------|----------|
| ABS()           | X        | LAST_DAY()      | X        | SINH()          | X        |
| ADD_TO_DATE()   | X        | LENGTH()        | X        | SQRT()          | X        |
| ASCII()         | -        | LN()            | X        | STDDEV()        | -        |
| AVG()           | X        | LOG()           | X        | SUBSTR()        | X        |
| CEIL()          | X        | LOWER()         | X        | SUM()           | X        |
| CHR()           | X        | LPAD()          | X        | SYSDATE()       | X        |
| CONCAT()        | X        | LTRIM()         | X        | SYSTIMESTAMP()  | X        |
| COS()           | X        | MAX()           | X        | TAN()           | X        |
| COSH()          | X        | MAKE_DATE_TIME  | X        | TANH()          | X        |
| COUNT()         | X        | MEDIAN()        | X        | TO_BIGINT       | X        |
| DATE_COMPARE()  | -        | MIN()           | X        | TO_CHAR(DATE)   | X        |
| DATE_DIFF()     | X        | MOD()           | X        | TO_CHAR(NUMBER) | X        |
| DECODE()        | X        | POWER()         | X        | TO_DATE()       | X        |
| EXP()           | X        | REG_REPLACE     | X        | TO_DECIMAL()    | X        |
| FLOOR()         | X        | REPLACECHR()    | X        | TO_FLOAT()      | X        |
| GET_DATE_PART() | X        | REPLACESTR()    | X        | TO_INTEGER()    | X        |
| IIF()           | X        | ROUND(NUMBER)   | X        | TRUNC(DATE)     | X        |
| IN()            | -        | RPAD()          | X        | TRUNC(NUMBER)   | X        |
| INITCAP()       | X        | RTRIM()         | X        | UPPER()         | X        |
| INSTR()         | X        | SIGN()          | X        | MD5()           | X        |
| ISNULL()        | X        | SIN()           | X        | VARIANCE()      | X        |

Supported operators for Snowflake Cloud Data Warehouse V2 mappings

When you use pushdown optimization, the Secure Agent converts the expression in the transformation by determining equivalent operators in the database. If there is no equivalent operator, the Secure Agent processes the transformation logic.
The following table lists the operators that you can push down to Snowflake. An X in the Pushdown column indicates that the operator can be pushed to Snowflake by using full pushdown optimization.
| Operator | Pushdown | Operator | Pushdown |
|----------|----------|----------|----------|
| +        | X        | =        | X        |
| -        | X        | >=       | X        |
| *        | X        | <=       | X        |
| /        | X        | !=       | X        |
| %        | X        | AND      | X        |
| \|\|     | X        | OR       | X        |
| >        | X        | NOT      | X        |
| <        | X        |          |          |

Supported transformations for Snowflake Cloud Data Warehouse V2 mappings

When you configure pushdown optimization, the Secure Agent tries to push the configured transformation to Snowflake.
The following list summarizes the transformations that you can push down to Snowflake. Pushdown: X indicates that the transformation can be pushed to Snowflake by using full pushdown optimization. Pushdown: - indicates that the transformation cannot be pushed to the database.

Aggregator (Pushdown: X)
You can perform the following aggregate calculations:
  • AVG
  • COUNT
  • MAX
  • MIN
  • MEDIAN
  • SUM
  • VARIANCE
When you configure an Aggregator transformation, you must use each of the incoming ports either in an aggregate function or in a group by field to define how to group data for aggregate expressions.

Expression (Pushdown: X)
You can add an Expression transformation to each of multiple sources, followed by a join downstream in the mapping. You can also add multiple Expression transformations that branch out from a transformation and then merge into another transformation downstream in the mapping.

Filter (Pushdown: X)

Joiner (Pushdown: X)

Lookup (Pushdown: X)
Both connected and unconnected lookups are applicable.
Consider the following rules when you configure a lookup in a mapping:
  • A lookup to Amazon S3, Google Cloud Storage, or Microsoft Azure Data Lake Storage Gen2 is supported only when the respective source used is Amazon S3, Google Cloud Storage, or Microsoft Azure Data Lake Storage Gen2.
  • A lookup to Snowflake is supported for a mapping with an Amazon S3, Google Cloud Storage, Microsoft Azure Data Lake Storage Gen2, or Snowflake source.
When you configure an unconnected Lookup transformation, consider the following rules:
  • You must set the Multiple Matches property to Report error in the unconnected lookup properties for pushdown optimization to work.
  • Do not configure an expression for an output received from an unconnected lookup.
For more information on specific rules for lookups, see Rules and guidelines for pushdown optimization.

Sorter (Pushdown: X)

Update Strategy (Pushdown: -)
You can instead use the update and upsert operations in the Target transformation to write to Snowflake.

Sequence Generator (Pushdown: X)
Shared sequence is not applicable. You can add the Sequence Generator transformation only after the Expression transformation in the mapping.
The NEXTVAL() port in a Sequence Generator transformation can be linked directly to a single input port or to multiple input ports in a Target transformation.

Router (Pushdown: X)
When you configure a Router transformation, you must connect or map only one output group to the Target transformation.

Union (Pushdown: X)
Applicable only for Snowflake sources. When you configure a Union transformation, if the source is Amazon S3, Google Cloud Storage, or Microsoft Azure Data Lake Storage Gen2, the mapping runs without pushdown optimization.

Supported features for Snowflake Cloud Data Warehouse V2 mappings

When you configure pushdown optimization, refer to the list of supported Snowflake Cloud Data Warehouse V2 properties in the Source, Target, and Lookup transformations.

Source properties

When you configure pushdown optimization, the mappings support the following properties for a Snowflake Cloud Data Warehouse V2 source:
The following source properties are not applicable:
You can configure pre-SQL and post-SQL in mappings enabled for source pushdown optimization that read from and write to Snowflake. Pre-SQL and post-SQL are not applicable for mappings enabled with full pushdown optimization.

Target properties

When you configure pushdown optimization, the mappings support the following properties for a Snowflake Cloud Data Warehouse V2 target:
The following target properties are not applicable:
Note: You can add multiple Snowflake Cloud Data Warehouse V2 targets in a mapping. The target can be the same Snowflake Cloud Data Warehouse V2 target table added multiple times or different Snowflake Cloud Data Warehouse V2 target tables.

Lookup properties

When you configure pushdown optimization, the mappings support the following properties for Snowflake Cloud Data Warehouse V2 connected and unconnected lookup:
The following lookup properties are not applicable:
Note: You can specify a parameter file in the mapping task to override the Snowflake Cloud Data Warehouse V2 source, lookup, and target connections and objects in a mapping.

Supported features for Amazon S3 V2 source

When you configure pushdown optimization, the mappings support the following properties for an Amazon S3 V2 source:

Supported features for Google Cloud Storage V2 source

When you configure pushdown optimization, the Google Cloud Storage V2 connection supports the following properties:
Server-side encryption is applicable.
When you configure pushdown optimization, the mappings support the following properties for a Google Cloud Storage V2 source:

Supported features for Microsoft Azure Data Lake Storage Gen2 source

When you configure pushdown optimization, the Microsoft Azure Data Lake Storage Gen2 connection supports the following properties:
When you configure pushdown optimization, the mappings support the following properties for a Microsoft Azure Data Lake Storage Gen2 source:
When you configure pushdown optimization, the mapping supports the following transformations:
For information about the configurations for the listed options, see the help for the Microsoft Azure Data Lake Storage Gen2 Connector.

Configuring pushdown optimization in a Snowflake Cloud Data Warehouse V2 mapping task

Perform the following steps to configure pushdown optimization in a mapping task:
    1. Create a mapping based on your requirement.
    2. Create a mapping task.
        a. Select the configured mapping.
        b. In the Pushdown Optimization section on the Schedule tab, set the pushdown optimization value to Full.
        c. Save the task and click Finish.
When you run the mapping task, the transformation logic is pushed to the configured database.

Verify the pushdown query in the session log

To verify that pushdown optimization was applied when the mapping ran, check the session log for the job. In Monitor, view the log for jobs. The queries in the session log show whether the mapping applied pushdown optimization.
For example, the following query is generated in the session log for a mapping enabled with full pushdown optimization:
In the example, the generated SQL combines the INSERT INTO and SELECT queries pushed down to the database as a single statement.
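Because the generated query is specific to each mapping and environment, the following is only a hedged illustration of such a combined statement; all object names are placeholders:

```sql
-- Hypothetical full-pushdown statement: the read (SELECT) and write
-- (INSERT INTO) run as one query inside Snowflake.
-- EMP_TARGET and EMP_SOURCE are placeholder table names.
INSERT INTO EMP_TARGET (EMP_ID, EMP_NAME, SALARY)
SELECT EMP_ID, EMP_NAME, SALARY * 1.1
FROM EMP_SOURCE
WHERE SALARY > 50000;
```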
If a failure is encountered while pushing down the mapping or while generating the pushdown query, the session log provides the details of the error. You can check the details to troubleshoot the error.
For example, the session log shows the following error details in the query:
When you do not enable pushdown optimization in a mapping, separate SELECT and INSERT statements are generated for the read and write operations:
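A hedged sketch of what those separate statements can look like without pushdown, using placeholder names: the Secure Agent reads the rows with one query, processes them, and writes them back in a separate operation.

```sql
-- Without pushdown, the read and write are separate round trips.
-- EMP_SOURCE and EMP_TARGET are placeholder table names.
SELECT EMP_ID, EMP_NAME, SALARY FROM EMP_SOURCE;

INSERT INTO EMP_TARGET (EMP_ID, EMP_NAME, SALARY) VALUES (?, ?, ?);
```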

Configuring a custom query or an SQL override for the Snowflake source object

You can push down a custom query or an SQL override to Snowflake.
Before you run a task that contains a custom query as the source object or you configure an SQL override, you must set the Create Temporary View session property in the mapping task properties.
Note: If you do not set the Create Temporary View property, the mapping runs without pushdown optimization.
Perform the following steps to set the property:
  1. In the mapping task, navigate to the Pushdown Optimization section on the Schedule tab.
  2. Select Create Temporary View.
  3. Click Finish.

Configuring copy options as additional runtime parameters

When you configure pushdown optimization for a mapping, you can specify the copy command options to load data from Amazon S3 to Snowflake.
Specify the options in the Additional Write Runtime Parameters field in the Snowflake Cloud Data Warehouse V2 advanced target properties of the Target transformation.
When you specify multiple copy command options, separate each option with an ampersand (&).
Note: The copy option MATCH_BY_COLUMN_NAME is not applicable.
For more information about the supported copy command options, see the Snowflake documentation at the following website: https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html
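For example, assuming the Snowflake copy options ON_ERROR and PURGE (both documented in the Snowflake COPY INTO reference), the field value might look like the following sketch, which is not taken from this guide:

```
ON_ERROR=CONTINUE&PURGE=TRUE
```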

Rules and guidelines for pushdown optimization

Certain rules and guidelines apply when you enable a mapping for pushdown optimization to a Snowflake database.

Mapping with Snowflake Cloud Data Warehouse V2 source and target

Use the following rules and guidelines when you configure pushdown optimization in a mapping that reads from and writes to Snowflake Cloud Data Warehouse V2:

Mapping with Amazon S3 V2 source and Snowflake Cloud Data Warehouse V2 target

Use the following rules and guidelines when you configure pushdown optimization in a mapping that reads from an Amazon S3 V2 source and writes to a Snowflake Cloud Data Warehouse V2 target: