Data Organization Google Professional Data Engineer GCP
- Every row in Bigtable is indexed by a single row key.
- Row keys are byte strings that may be up to 4 KB.
- Rows are sorted lexicographically by row key, so related rows with adjacent keys can be retrieved quickly in a single, contiguous scan.
- Each row may contain thousands of columns.
- Bigtable only scans the row key when performing lookups.
- Reads and writes are atomic at the row level.
- Data is stored in scalable tables, each of which is a sorted key/value map.
- Each row typically describes a single entity.
- Columns contain individual values for each row.
- Related columns are grouped into a column family.
- Each column is identified by the combination of its column family and a column qualifier, which is a unique name within that family.
- Each row/column intersection can contain multiple cells, or versions, at different timestamps.
- Tables are sparse; if a cell does not contain any data, it does not take up any space.
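The points above can be illustrated with a small in-memory sketch. This is not the Bigtable client API; `ToyTable` and its methods are hypothetical names chosen for illustration, and the row keys and values are made up. It only mimics three ideas from the list: rows kept sorted by row key, prefix scans as a contiguous range, and multiple timestamped versions per cell.

```python
import bisect


class ToyTable:
    """Illustrative stand-in for a Bigtable table (not the real client API)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted byte order
        self._rows = {}  # row key -> {(family, qualifier): [(timestamp, value), ...]}

    def write(self, row_key, family, qualifier, value, timestamp):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        cells = self._rows[row_key].setdefault((family, qualifier), [])
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)  # newest version first

    def read_latest(self, row_key, family, qualifier):
        # Absent cells are simply missing from the dict: the table is sparse.
        cells = self._rows.get(row_key, {}).get((family, qualifier))
        return cells[0][1] if cells else None

    def scan_prefix(self, prefix):
        # Sorted keys mean all matches form one contiguous range.
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches


table = ToyTable()
table.write(b"user#alice#2024-01-01", "stats", "clicks", 3, timestamp=100)
table.write(b"user#alice#2024-01-02", "stats", "clicks", 5, timestamp=200)
table.write(b"user#bob#2024-01-01", "stats", "clicks", 7, timestamp=150)
table.write(b"user#alice#2024-01-01", "stats", "clicks", 4, timestamp=300)

alice_rows = table.scan_prefix(b"user#alice#")
latest = table.read_latest(b"user#alice#2024-01-01", "stats", "clicks")
```

Here `alice_rows` holds the two `user#alice#` keys in sorted order, and `latest` is the value written with the highest timestamp. Putting the entity identifier at the front of the row key is what makes the related rows adjacent.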
For example, suppose you’re building a social network for United States presidents—let’s call it Prezzy. Each president can follow posts from other presidents. The following illustration shows a Cloud Bigtable table that tracks who each president is following on Prezzy:
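In the figure above:
- The table has a single column family, “follows”, which contains many column qualifiers.
- The column qualifiers themselves are used as data (the usernames being followed). This design relies on Bigtable tables being sparse and on the fact that new column qualifiers can be added as the data changes.
- The row key is the username. Assuming usernames are evenly spread across the alphabet, reads and updates are distributed fairly uniformly across the table.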
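As a rough sketch of the "qualifiers as data" idea, the Prezzy table can be modelled with plain Python dictionaries. The usernames and follow relationships below are invented for illustration and are not taken from the figure; the helper names `following` and `follow` are also hypothetical, and strings are used instead of byte strings for readability.

```python
# Row key = username; column = ("follows", <followed username>); value = 1.
# Only relationships that exist are stored, so each row stays sparse.
prezzy = {
    "gwashington": {("follows", "jadams"): 1},
    "jadams": {("follows", "gwashington"): 1, ("follows", "tjefferson"): 1},
    "tjefferson": {("follows", "gwashington"): 1, ("follows", "jadams"): 1},
}


def following(table, username):
    """Return who `username` follows by reading the qualifiers of one row."""
    row = table.get(username, {})
    return sorted(qualifier for (family, qualifier) in row if family == "follows")


def follow(table, username, target):
    """Add a follow by writing a new qualifier; no schema change is needed."""
    table.setdefault(username, {})[("follows", target)] = 1


follow(prezzy, "gwashington", "tjefferson")
washington_follows = following(prezzy, "gwashington")
```

Answering "who does this user follow?" touches a single row, looked up directly by its row key, and adding a follow is a single-row write.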