Data processing/compute provisioning
This tutorial covers data processing and compute provisioning on Google Cloud: creating sole-tenant node templates and node groups, provisioning VMs on them, and tuning performance and cost.
Creating a sole-tenant node template
Sole-tenant node templates are regional resources that specify properties for sole-tenant node groups. You must create a node template before you create a node group.
- Firstly, in the Google Cloud Console, go to the Sole-tenant nodes page.
- Secondly, click Node templates.
- Thirdly, click Create node template to begin creating a sole-tenant node template.
- Then, specify a Name for the node template.
- After that, specify a Region to create the node template in. You can use the node template to create node groups in any zone of this region.
- Next, specify the Node type for each sole-tenant node in the node groups created from this node template.
- Optionally, add Node affinity labels. Affinity labels let you logically group nodes and node groups. When provisioning VMs, you can specify affinity labels on the VMs to schedule them on a specific set of nodes or node groups.
- Lastly, click Create to finish creating your node template.
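The console steps above can also be scripted. The following is a minimal sketch, assuming the `gcloud compute sole-tenancy node-templates create` command and its `--region`, `--node-type`, and `--node-affinity-labels` flags as described in the gcloud reference. The helper name `node_template_command` and all sample values are hypothetical, and the helper only assembles the command for review; verify the flags against your gcloud version before running it.

```python
import shlex


def node_template_command(name, region, node_type, affinity_labels=None):
    """Build the gcloud argument list for creating a sole-tenant node template.

    Hypothetical helper: it assembles the command but does not run it.
    """
    cmd = [
        "gcloud", "compute", "sole-tenancy", "node-templates", "create", name,
        f"--region={region}",
        f"--node-type={node_type}",
    ]
    if affinity_labels:
        # Affinity labels are passed as comma-separated key=value pairs.
        pairs = ",".join(f"{k}={v}" for k, v in sorted(affinity_labels.items()))
        cmd.append(f"--node-affinity-labels={pairs}")
    return cmd


if __name__ == "__main__":
    # Sample values only; substitute your own name, region, and node type.
    cmd = node_template_command(
        "example-template", "us-central1", "n1-node-96-624",
        {"workload": "batch"},
    )
    # Print a shell-quoted command line so it can be reviewed before use.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.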
Creating a sole-tenant node group
Use the previously created sole-tenant node template to create a sole-tenant node group. A sole-tenant node group inherits the properties specified by the node template and has additional values that you must specify.
- Firstly, in the Google Cloud Console, go to the Sole-tenant nodes page.
- Secondly, click Create node group to begin creating a node group.
- Thirdly, specify a Name for the node group.
- Fourthly, specify the Region for the node group to display the available node templates in that region.
- Then, specify the Zone within the region to create the node group in.
- After that, specify the Node template to create the node group from. The selected node template is applied to the node group.
- Choose one of the following for the Autoscaling mode for the node group autoscaler:
  - Don’t configure autoscale
  - Autoscale
  - Autoscale only out
- Now, specify the Number of nodes for the group or, if you enable the node group autoscaler, a range for the size of the node group.
- After that, set the Maintenance policy for the sole-tenant node group to one of the following values. The maintenance policy lets you configure the behavior of VMs on the node group during host maintenance events.
  - Default
  - Restart in place
  - Migrate within node group
- Lastly, click Create to finish creating the node group.
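The node group settings above map onto a command-line call as well. This is a sketch under stated assumptions: it relies on the `gcloud compute sole-tenancy node-groups create` command with the `--node-template`, `--zone`, `--target-size`, `--maintenance-policy`, `--autoscaler-mode`, `--min-nodes`, and `--max-nodes` flags from the gcloud reference, and it assumes the console options correspond to the flag values listed in the two sets below. The helper name `node_group_command` is hypothetical, and the helper only builds the command without running it.

```python
# Assumed flag values corresponding to the console options.
AUTOSCALER_MODES = {"off", "on", "only-scale-out"}
MAINTENANCE_POLICIES = {"default", "restart-in-place", "migrate-within-node-group"}


def node_group_command(name, zone, template, target_size,
                       maintenance_policy="default",
                       autoscaler_mode="off", min_nodes=None, max_nodes=None):
    """Build the gcloud argument list for creating a sole-tenant node group."""
    if autoscaler_mode not in AUTOSCALER_MODES:
        raise ValueError(f"unknown autoscaler mode: {autoscaler_mode}")
    if maintenance_policy not in MAINTENANCE_POLICIES:
        raise ValueError(f"unknown maintenance policy: {maintenance_policy}")

    cmd = [
        "gcloud", "compute", "sole-tenancy", "node-groups", "create", name,
        f"--zone={zone}",
        f"--node-template={template}",
        f"--target-size={target_size}",
        f"--maintenance-policy={maintenance_policy}",
    ]
    if autoscaler_mode != "off":
        # With autoscaling enabled, the group size is a range, so an upper
        # bound is required in this sketch.
        if max_nodes is None:
            raise ValueError("max_nodes is required when autoscaling is enabled")
        cmd.append(f"--autoscaler-mode={autoscaler_mode}")
        if min_nodes is not None:
            cmd.append(f"--min-nodes={min_nodes}")
        cmd.append(f"--max-nodes={max_nodes}")
    return cmd


if __name__ == "__main__":
    # Sample values only.
    print(" ".join(node_group_command(
        "example-group", "us-central1-a", "example-template", 1,
        autoscaler_mode="on", min_nodes=1, max_nodes=4,
    )))
```

Validating the mode and policy before building the command catches typos locally instead of waiting for the API to reject the request.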
Provisioning a sole-tenant VM
After creating a node group based on a previously created node template, you can provision individual VMs on that sole-tenant node group. To provision a VM on a specific node or node group whose affinity labels match those you previously assigned to the node template, follow the standard procedure for creating a VM instance and assign the matching affinity labels to the VM. Alternatively, create the VM from the node group details page as follows.
- Firstly, in the Google Cloud Console, go to the Sole-tenant nodes page.
- Secondly, click Node groups.
- Thirdly, click the Name of the node group to provision a VM instance on. Optionally, to provision the VM on a specific sole-tenant node, click the name of that node.
- Fourthly, click Create instance to provision a VM instance on this node group. Note the values automatically applied for the Name, Region, and Zone, and modify them as necessary.
- After that, select a Machine configuration by specifying the Machine family, Series, and Machine type.
- Modify the Boot disk, Firewall, and other settings as necessary.
- Now, click Sole Tenancy, note the automatically assigned Node affinity labels, and use Browse to adjust as necessary.
- Click Management, and for On host maintenance, choose one of the following:
  - Migrate VM instance (recommended)
  - Terminate
- For Automatic restart, choose one of the following:
  - On (recommended)
  - Off
- Lastly, click Create to finish creating your sole-tenant VM.
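To make the role of affinity labels concrete, the following simplified sketch models how a VM's affinity expressions could be checked against the labels on a node. It is an illustration only, not the scheduler Compute Engine actually uses. The `Affinity` class and `node_matches` function are hypothetical names, and the `IN` and `NOT_IN` operators are assumed to mirror the node affinity operators exposed by the Compute Engine API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affinity:
    """One affinity expression on a VM (hypothetical model)."""
    key: str
    operator: str  # "IN" or "NOT_IN"
    values: tuple


def node_matches(node_labels, affinities):
    """Return True if a node's labels satisfy every affinity expression."""
    for aff in affinities:
        if aff.operator not in ("IN", "NOT_IN"):
            raise ValueError(f"unsupported operator: {aff.operator}")
        present = node_labels.get(aff.key) in aff.values
        if aff.operator == "IN" and not present:
            return False
        if aff.operator == "NOT_IN" and present:
            return False
    return True


if __name__ == "__main__":
    # Sample node labels; the names and values are made up for illustration.
    nodes = {
        "node-a": {"workload": "batch", "env": "prod"},
        "node-b": {"workload": "web", "env": "prod"},
    }
    wanted = [Affinity("workload", "IN", ("batch",))]
    eligible = [name for name, labels in nodes.items() if node_matches(labels, wanted)]
    print(eligible)
```

A VM with no affinity expressions matches every node in this model, which is why labels matter only when you want to pin workloads to a subset of nodes.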
Deleting a node group
If you need to delete a sole-tenant node group, first remove any VMs from the node group.
- Firstly, go to the Sole-tenant nodes page.
- Secondly, click the Name of the node group to delete.
- Thirdly, for each node in the node group, click the node’s name and delete the individual VM instances on the node details page, or follow the standard procedure to delete an individual VM. To delete instances that belong to a managed instance group, delete the managed instance group itself.
- Next, after deleting all VM instances running on all nodes of the node group, return to the Sole-tenant nodes page.
- Then, click Node groups.
- After that, select the name of the node group you need to delete.
- Lastly, click Delete.
Performance Strategies
- Firstly, evaluate performance requirements. Determine the priority of your various applications and the minimum performance you require of them.
- Secondly, use scalable design patterns. Improve scalability and performance with autoscaling, compute choices, and storage configurations.
- Lastly, identify and implement cost-saving approaches. Evaluate the cost of each running service and weigh it against the service's priority to balance availability and cost.
Use autoscaling and data processing
Use autoscaling so that as load increases or decreases, the services add or release resources to match.
Compute Engine autoscaling
Managed instance groups (MIGs) let you scale your stateless apps on multiple identical VMs, so that a group of Compute Engine resources is launched based on a single template. Autoscaling policies include scaling based on CPU utilization, load balancing capacity, Cloud Monitoring metrics, or, for zonal MIGs, a queue-based workload such as Pub/Sub.
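As a rough illustration of scaling on CPU utilization, the sketch below estimates how many VMs a group needs so that average utilization returns to a target. This is a simplified model, not the actual Compute Engine autoscaler algorithm, which also accounts for stabilization periods and other signals. The function name `recommended_size` and its parameters are hypothetical, and utilization is expressed as whole-number percentages to keep the arithmetic exact.

```python
def recommended_size(current_size, avg_cpu_percent, target_cpu_percent,
                     min_size, max_size):
    """Estimate the group size that brings average CPU back to the target.

    All arguments are integers; CPU values are percentages (0-100).
    """
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in 1..100")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")

    # Total load in "VM-percent" units, spread over VMs running at the target.
    total_load = current_size * avg_cpu_percent
    needed = -(-total_load // target_cpu_percent)  # integer ceiling division

    # Clamp to the configured bounds of the group.
    return max(min_size, min(max_size, needed))


if __name__ == "__main__":
    # Four VMs averaging 90% CPU with a 60% target, bounded to 1..10 VMs.
    print(recommended_size(4, 90, 60, 1, 10))
```

The same calculation scales the group in when utilization drops below the target, subject to the minimum size.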
Google Kubernetes Engine autoscaling
You can use the cluster autoscaler feature in Google Kubernetes Engine (GKE) to manage your cluster’s node pool based on the varying demand of your workloads. Cluster autoscaler increases or decreases the size of the node pool automatically, based on the resource requests of Pods running on that node pool’s nodes.
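To show why resource requests, rather than actual usage, drive the node count, here is a deliberately simplified sketch that packs Pod CPU requests onto identical nodes using a first-fit-decreasing heuristic. The real cluster autoscaler simulates the Kubernetes scheduler and considers memory, affinity, and other constraints, so treat this only as an approximation. The function name `nodes_needed` and the millicore figures are hypothetical.

```python
def nodes_needed(pod_cpu_requests_m, node_allocatable_m):
    """Estimate how many identical nodes are needed to fit the Pod requests.

    pod_cpu_requests_m: CPU requests per Pod, in millicores.
    node_allocatable_m: allocatable CPU per node, in millicores.
    """
    free = []  # remaining capacity of each node opened so far
    for request in sorted(pod_cpu_requests_m, reverse=True):
        if request > node_allocatable_m:
            raise ValueError(f"a {request}m request cannot fit on any node")
        for i, capacity in enumerate(free):
            if request <= capacity:
                free[i] = capacity - request  # place on the first node that fits
                break
        else:
            # No existing node has room, so a new node is added.
            free.append(node_allocatable_m - request)
    return len(free)


if __name__ == "__main__":
    # Five Pods on nodes with 2000m of allocatable CPU each.
    print(nodes_needed([500, 1500, 1000, 250, 750], 2000))
```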
Serverless autoscaling
Serverless compute options include Cloud Run, App Engine, and Cloud Functions, each of which provides autoscaling capabilities. Use these serverless options to scale your microservices or functions.
Data processing
Dataproc and Dataflow offer autoscaling options to scale your data pipelines and data processing. Use these options to give your pipelines access to more computing resources as the processing load grows.
Use GPUs and TPUs to increase performance
Google Cloud provides options to accelerate the performance of your workloads. You can use these specialized hardware platforms to increase your application and data processing performance.
Graphics Processing Unit (GPU)
Compute Engine provides GPUs that you can add to your virtual machine instances. You can use these GPUs to accelerate specific workloads on your instances, such as machine learning and data processing.
Tensor Processing Unit (TPU)
A TPU is a matrix processor designed by Google specifically for machine learning workloads. TPUs are best suited for massive matrix operations with a large pipeline and significantly less memory access.
Analyze your costs and optimize
- The first step in optimizing your cost is to understand your current usage and costs. Google Cloud provides an Export Billing to BigQuery feature that gives you a detailed way to analyze your billing data. You can connect BigQuery to Google Data Studio for visual analysis, or to third-party business intelligence (BI) tools such as Tableau, Qlik, or Looker.
- Secondly, sustained use discounts are automatic discounts for running specific Compute Engine resources for a significant portion of the billing month.
- Next, committed use discounts are ideal for workloads with predictable resource needs. When you purchase a committed use contract, you purchase a certain amount of vCPUs, memory, GPUs, and local SSDs at a discounted price. This is in return for committing to pay for those resources for 1 year or 3 years.
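The discount mechanics above can be worked through with some simple arithmetic. The sketch below assumes the incremental sustained use tiers published for N1 machine types (each successive quarter of the month billed at 100%, 80%, 60%, and 40% of the base rate); other machine families use different tiers, so check the current pricing documentation. The function names, the base cost, and the commitment discount used in the example are hypothetical.

```python
# Assumed N1 tiers: (upper bound of usage as a fraction of the month,
# multiplier applied to the base rate within that tier).
SUSTAINED_USE_TIERS = [(0.25, 1.0), (0.50, 0.8), (0.75, 0.6), (1.00, 0.4)]


def sustained_use_cost(base_monthly_cost, usage_fraction):
    """Cost after sustained use discounts.

    base_monthly_cost: on-demand cost of running for the whole month.
    usage_fraction: share of the month the resource actually ran (0.0-1.0).
    """
    if not 0.0 <= usage_fraction <= 1.0:
        raise ValueError("usage_fraction must be between 0 and 1")
    cost = 0.0
    lower = 0.0
    for upper, multiplier in SUSTAINED_USE_TIERS:
        if usage_fraction > lower:
            used_in_tier = min(usage_fraction, upper) - lower
            cost += base_monthly_cost * used_in_tier * multiplier
        lower = upper
    return cost


def commitment_pays_off(expected_utilization, commitment_discount):
    """Compare a commitment with plain on-demand pricing.

    A commitment is billed whether or not the resources are used, so it is
    cheaper only when expected utilization exceeds (1 - discount).
    """
    return expected_utilization > 1.0 - commitment_discount


if __name__ == "__main__":
    # Hypothetical figures: a 100.00 base monthly cost and a 30% commitment discount.
    print(round(sustained_use_cost(100.0, 1.0), 2))
    print(commitment_pays_off(0.9, 0.3))
```

Under these assumed tiers, a resource that runs for the full month is billed at 70% of its base cost, which is why sustained use discounts reward steady workloads even without a contract.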
Reference: Google Documentation, Doc 2