BigQuery Best Practices: Google Professional Data Engineer (GCP)

For Controlling Costs

  • Avoid SELECT *
  • Sample data using preview options
  • Price queries before running them
  • Limit query costs by restricting the number of bytes billed
  • Do not rely on LIMIT to reduce cost: on non-clustered tables it does not reduce the bytes read
  • View costs using a dashboard and query audit logs
  • Partition data by date
  • Materialize query results in stages
  • Consider the cost of large result sets
  • Use streaming inserts with caution
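
Two of the items above, pricing a query before running it and restricting the bytes billed, can be sketched in Python. This is a minimal sketch, assuming the google-cloud-bigquery client library is installed and that `client` is an authenticated `bigquery.Client`. The helper names and the per-TiB rate passed by the caller are illustrative assumptions, not official values; check the current pricing page for the real rate.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimated_cost_usd(bytes_processed: int, usd_per_tib: float) -> float:
    """Convert a byte count into an on-demand cost estimate.

    usd_per_tib is supplied by the caller (an assumption, not a fixed price).
    """
    return bytes_processed / TIB * usd_per_tib


def dry_run_bytes(client, sql: str) -> int:
    """Return the bytes a query would process, without running it."""
    # Imported lazily so the pure helper above works without the library.
    from google.cloud import bigquery

    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    return client.query(sql, job_config=config).total_bytes_processed


def run_with_cap(client, sql: str, max_bytes: int):
    """Run a query that fails instead of billing more than max_bytes."""
    from google.cloud import bigquery

    config = bigquery.QueryJobConfig(maximum_bytes_billed=max_bytes)
    return client.query(sql, job_config=config).result()
```

A typical flow would call `dry_run_bytes` first, pass the result to `estimated_cost_usd` with the current rate, and only then call `run_with_cap` with a limit slightly above the dry-run figure.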

For Query Performance

  • Avoid repeated joins and subqueries
  • Carefully consider materializing large result sets
  • Use a LIMIT clause with large sorts
  • Denormalize data whenever possible
  • Avoid excessive wildcard tables
  • Reduce data before using a JOIN
  • Do not treat WITH clauses as prepared statements
  • Avoid tables sharded by date; use partitioned tables instead
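
Several of the items above (reducing data before a JOIN, using LIMIT with a large sort, and naming columns instead of selecting everything) can be combined in one query. The sketch below is illustrative only: the tables `mydataset.orders` and `mydataset.customers` and all column names are hypothetical, and the runner assumes the google-cloud-bigquery client library and an authenticated `client`.

```python
# Hypothetical schema: mydataset.orders is partitioned on order_date.
REDUCED_JOIN_SQL = """
SELECT o.customer_id, c.country, o.total_spent
FROM (
  -- Filter and aggregate first so the join sees one row per customer.
  SELECT customer_id, SUM(amount) AS total_spent
  FROM mydataset.orders
  WHERE order_date BETWEEN @start_date AND @end_date
  GROUP BY customer_id
) AS o
JOIN mydataset.customers AS c
  ON o.customer_id = c.customer_id
ORDER BY o.total_spent DESC
LIMIT 100
"""


def run_reduced_join(client, start_date, end_date):
    """Run the query with date parameters (datetime.date values)."""
    # Imported lazily; requires google-cloud-bigquery.
    from google.cloud import bigquery

    config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start_date", "DATE", start_date),
            bigquery.ScalarQueryParameter("end_date", "DATE", end_date),
        ]
    )
    return client.query(REDUCED_JOIN_SQL, job_config=config).result()
```

The date filter on the partitioning column limits which partitions are read, and the aggregation inside the subquery shrinks the left side of the join before any rows are matched.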

For Storage

  • Use the expiration settings to remove unneeded tables and partitions
  • Take advantage of long-term storage pricing for tables and partitions not modified for 90 consecutive days
  • Use the pricing calculator to estimate storage costs
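
The expiration settings mentioned above can be applied from Python. This is a rough sketch, assuming the google-cloud-bigquery client library and an authenticated `client`. The helper names and the table ID `mydataset.events` used in the usage note are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expiry_after(start: datetime, days: int) -> datetime:
    """Return the timestamp `days` days after `start`."""
    return start + timedelta(days=days)


def set_table_expiry(client, table_id: str, days: int) -> None:
    """Make a table expire `days` days from now."""
    table = client.get_table(table_id)
    table.expires = expiry_after(datetime.now(timezone.utc), days)
    # Only the "expires" field is sent in the update.
    client.update_table(table, ["expires"])


def partition_expiry_ddl(table_id: str, days: int) -> str:
    """Build a DDL statement that sets partition expiration on a table."""
    return (
        f"ALTER TABLE `{table_id}` "
        f"SET OPTIONS (partition_expiration_days = {days})"
    )
```

For example, `partition_expiry_ddl("mydataset.events", 90)` produces a statement that can be run as an ordinary query so that partitions older than 90 days are removed automatically.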
