Google Cloud properties

To configure properties in an elastic configuration, open the Elastic Clusters page and either click New Elastic Configuration or click the name of the configuration that you want to edit.
The basic properties describe the elastic configuration and define the cloud platform to host the elastic cluster. To configure the cluster, configure the platform, advanced, and runtime properties.

Basic configuration

The following table describes the basic properties:
| Property | Description |
| --- | --- |
| Name | Name of the elastic configuration. |
| Description | Description of the elastic configuration. |
| Runtime Environment | Runtime environment to associate with the elastic configuration. The runtime environment can contain only one Secure Agent. A runtime environment cannot be associated with more than one configuration. |
| Cloud Platform | Cloud platform that hosts the elastic cluster. Select Google Cloud Platform (GCP). |

Platform configuration

The following table describes the platform properties:
| Property | Description |
| --- | --- |
| Region | Region in which to create the cluster. Use the drop-down menu to view the regions that you can use. |
| Master Instance Type | Instance type to host the master node. Use the drop-down menu to view the instance types that you can use. |
| Master Service Account | Service account to attach to the master node. |
| Worker Instance Type | Instance type to host the worker nodes. Use the drop-down menu to view the instance types that you can use. |
| Number of Worker Nodes | Number of worker nodes in the cluster. Specify the minimum and maximum number of worker nodes. |
| Worker Service Account | Service account to attach to the worker nodes. |
| Private Cluster | Creates an elastic cluster in which cluster resources have only private IP addresses. If you choose to create a private cluster, you must specify the VPC and subnet in the advanced properties. The Secure Agent must be in the same VPC network or a VPC network that can connect to the VPC that you specify in the advanced properties. |
| Availability Zones | List of availability zones where cluster nodes are created. The master node is created in the first availability zone in the list. If multiple zones are specified, the cluster nodes are created across the specified zones. The zones must be unique and be within the specified region. |
| Disk Size | Size of the persistent disk to attach to a worker node for temporary storage during data processing. The disk size must be between 50 GB and 16 TB. |
| Cluster Shutdown | Cluster shutdown method. You can select one of the following cluster shutdown methods: Smart shutdown (the Secure Agent stops the cluster when no job is expected during the defined idle timeout, based on historical data) or Idle timeout (the Secure Agent stops the cluster after the amount of idle time that you define). |
| Mapping Task Timeout | Amount of time to wait for a mapping task to complete before it is terminated. By default, a mapping task does not have a timeout. If you specify a timeout, a value of at least 10 minutes is recommended. The timeout begins when the mapping task is submitted to the Secure Agent. |
| Staging Location | Location on Google Cloud Storage for staging data. The location name must start with gs://. |
| Log Location | Location on Google Cloud Storage to store logs that are generated when you run an elastic job. The location name must start with gs://. |
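
Several of the platform constraints can be checked before you save a configuration. The following Python sketch is illustrative only: the function name validate_platform, its parameters, and the error messages are hypothetical and are not part of any Informatica API. It assumes that a Google Cloud zone name is the region name followed by a hyphen and a zone letter (for example, us-central1-a) and that 16 TB means 16 x 1024 GB.

```python
# Hypothetical pre-save check for a few platform properties.
# Assumption: 16 TB is interpreted as 16 * 1024 GB.
MIN_DISK_GB = 50
MAX_DISK_GB = 16 * 1024


def validate_platform(region, zones, disk_size_gb, staging_location, log_location):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    errors = []

    # Availability zones must be unique.
    if len(set(zones)) != len(zones):
        errors.append("availability zones must be unique")

    # Assumption: a zone belongs to a region if its name starts with "<region>-".
    for zone in sorted(set(zones)):
        if not zone.startswith(region + "-"):
            errors.append(f"zone {zone} is not in region {region}")

    # Disk size must be between 50 GB and 16 TB.
    if not MIN_DISK_GB <= disk_size_gb <= MAX_DISK_GB:
        errors.append(f"disk size {disk_size_gb} GB is outside the allowed range")

    # Staging and log locations must start with gs://.
    for name, location in (("staging", staging_location), ("log", log_location)):
        if not location.startswith("gs://"):
            errors.append(f"{name} location must start with gs://")

    return errors
```

For example, validate_platform("us-central1", ["us-central1-a", "us-central1-b"], 100, "gs://my-bucket/staging", "gs://my-bucket/logs") returns an empty list, where my-bucket is a placeholder bucket name.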

Advanced configuration

The following table describes the advanced properties:
| Property | Description |
| --- | --- |
| VPC | Google Cloud Virtual Private Cloud (VPC) in which to create the cluster. The VPC must be in the specified region. If you do not specify a VPC, the agent creates a VPC on your Google Cloud account based on the region and the availability zones that you select. |
| Subnet | Subnets in which to create cluster nodes. Use a comma-separated list to specify the subnets. Required if a VPC is specified. Each subnet must be in a different availability zone within the specified VPC. If you do not specify a VPC, you cannot specify subnets. You must provide availability zones instead of subnets. |
| IP Address Range | CIDR block that specifies the IP address range that the cluster can use. For example: 10.0.0.0/24 |
| Initialization Script Path | Google Cloud Storage file path of the initialization script to run on each cluster node when the node is created. Use the format: <bucket name>/<folder name>. The script can reference other init scripts in the same bucket or in a subdirectory. The script must be a bash script. |
| Cluster Labels | Labels to apply to cluster nodes. Each label has a key and a value. The key can be up to 63 characters long. You can list a maximum of 55 labels. The Secure Agent also assigns default labels to cloud resources. The default labels do not contribute to the limit of 55 labels. |
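
The rules for the VPC, subnets, IP address range, and cluster labels can be expressed as a short check. The following Python sketch is an illustration under stated assumptions: validate_advanced and its parameters are hypothetical names, not an Informatica API, and the CIDR check relies on the standard library ipaddress module, which rejects a block whose host bits are set (such as 10.0.0.1/24).

```python
import ipaddress

MAX_LABELS = 55       # limit on user-defined cluster labels
MAX_KEY_LENGTH = 63   # maximum length of a label key


def validate_advanced(vpc, subnets, ip_range, labels):
    """Return a list of problems found in the advanced properties (hypothetical helper)."""
    errors = []

    # Subnets are required with a VPC and not allowed without one.
    if vpc and not subnets:
        errors.append("subnets are required when a VPC is specified")
    if subnets and not vpc:
        errors.append("subnets cannot be specified without a VPC")

    # The IP address range must be a valid CIDR block, for example 10.0.0.0/24.
    if ip_range is not None:
        try:
            ipaddress.ip_network(ip_range)
        except ValueError:
            errors.append(f"{ip_range} is not a valid CIDR block")

    # At most 55 labels, each with a key of at most 63 characters.
    if len(labels) > MAX_LABELS:
        errors.append(f"too many labels: {len(labels)} (maximum {MAX_LABELS})")
    for key in labels:
        if len(key) > MAX_KEY_LENGTH:
            errors.append(f"label key longer than {MAX_KEY_LENGTH} characters")

    return errors
```

The check does not verify that each subnet is in a different availability zone, because that requires a lookup against the Google Cloud account.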

Runtime configuration

The following table describes the runtime properties:
| Property | Description |
| --- | --- |
| Encrypt Data | Indicates whether temporary data on the cluster is encrypted. |
| Runtime Properties | Custom properties to customize the cluster and the jobs that run on the cluster. |

Propagating labels to cloud resources

The Secure Agent propagates labels to cloud resources based on the cluster labels that you specify in an elastic configuration.
The agent propagates labels to the following resources:
If your enterprise follows a tagging policy, make sure to manually assign labels to other cloud resources.
Note: The Secure Agent propagates labels only to cloud resources that the agent creates. For example, if you create a network and specify the network in an elastic configuration, the agent does not propagate cluster labels to the network.

Initialization scripts

Cluster nodes can run an initialization script based on an init script path that you specify in an elastic configuration. Each node runs the script when the node is created, and the script can reference other init scripts.
You might want to run an init script to install additional software on the cluster. For example, your enterprise policy might require each cluster node to contain monitoring and anti-virus software to protect your data.
Consider the following guidelines when you create the init script:
The init script path must be in cloud storage. You can place the scripts in a unique path on the cloud storage system, or you can place the scripts in the staging location.

Initialization script failures

When an initialization script fails on a cluster node, it can have a significant impact on the elastic cluster. An init script failure can prevent the cluster from scaling up or cause the Secure Agent to terminate the cluster.
Note the impact that an init script failure can have in the following situations:
Failure during cluster creation
If the init script fails on any node during cluster creation, the Secure Agent terminates the cluster.
Resolve the issues with the init script before running a job to start the cluster again.
Failure during a scale-up event
If the init script fails on a node that is added to the cluster during a scale-up event, the node fails to start and the cluster fails to scale up. If the cluster attempts to scale up again and the node still fails to start, each failure adds to the number of accumulated node failures until the Secure Agent terminates the cluster.
Failure while recovering a master node
If you enable high availability in an AWS environment and the init script fails on a recovered master node, the node fails to start and contributes to the number of accumulated node failures over the cluster lifecycle.
Accumulated failures over the cluster lifecycle
During the cluster lifecycle, the Secure Agent tracks the number of accumulated node failures that occur due to an init script within a certain time frame. If the number of failures is too high, the agent terminates the cluster.
Find the log files for the nodes where the init script failed and use the log files to resolve the failures before running a job to start the cluster again.
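
The idea of counting failures within a time frame can be pictured as a sliding window. The following Python sketch is purely conceptual: the class FailureTracker, the threshold, and the window length are hypothetical, because the actual limits and the Secure Agent's internal logic are not stated here.

```python
from collections import deque


class FailureTracker:
    """Conceptual sliding-window counter for init script node failures.

    max_failures and window_seconds are hypothetical values chosen by the
    caller; they do not reflect documented Secure Agent limits.
    """

    def __init__(self, max_failures, window_seconds):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()  # timestamps of recent failures, oldest first

    def record_failure(self, timestamp):
        """Record a failure and return True if the cluster should be terminated."""
        self._failures.append(timestamp)
        # Drop failures that fall outside the time window.
        while timestamp - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        return len(self._failures) > self.max_failures
```

With a tracker created as FailureTracker(max_failures=2, window_seconds=60), a third failure recorded within 60 seconds of the first makes record_failure return True, while failures spaced further apart do not accumulate.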

Updating the runtime environment or the staging location

To update the runtime environment or the staging location in an elastic configuration, perform one of the following tasks based on the status of the Secure Agent and the elastic cluster:
The Secure Agent and the elastic cluster are running.
If the agent and the cluster are running, complete the following tasks:
  1. Update the runtime environment or the staging location in the elastic configuration.
  2. Stop the cluster when you save the configuration.
The Secure Agent is unavailable or the elastic cluster cannot be reached.
If the agent is unavailable or the cluster cannot be reached, complete the following tasks:
  1. Run the command to delete the cluster or make sure that all cluster resources are deleted by logging in to your account on the cloud platform. For information about commands, see Command reference.
  2. Update the runtime environment or the staging location in the elastic configuration.
  3. Disable the cluster when you save the configuration.
Note: If you update the runtime environment, the new Secure Agent will create a new elastic cluster with a different cluster ID.