Pipeline Lifecycle, Creation and Transforms: Google Professional Data Engineer (GCP)
Pipeline Creation
To construct a pipeline using the Beam SDKs:
- Create a Pipeline object.
- Use a Read or Create transform to create one or more PCollections for pipeline data.
- Apply transforms to each PCollection.
- Write or otherwise output the final, transformed PCollections.
- Run the pipeline.
- Pipeline execution is separate from the Apache Beam program's execution: the program only constructs the pipeline, and a pipeline runner executes it.
- You can specify the pipeline runner and other execution options.
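The steps above can be sketched with the Beam Python SDK. This is a minimal illustration, assuming the apache-beam package is installed; the helper normalize, the function run, the sample input and the output prefix "output" are hypothetical names chosen for this sketch. The Beam imports sit inside run so that the element-level helper stays usable without the SDK.

```python
def normalize(line):
    """Element-wise step (hypothetical): trim whitespace and upper-case one line."""
    return line.strip().upper()


def run(argv=None, output_prefix="output"):
    # Deferred imports: only needed when the pipeline is actually built.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # The runner and other execution options are parsed from argv,
    # for example ["--runner=DirectRunner"].
    options = PipelineOptions(argv)

    # Create the Pipeline object. Leaving the with-block hands the pipeline
    # to the configured runner and waits for it to finish.
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # Create transform: builds a PCollection from in-memory data.
            | "CreateLines" >> beam.Create(["  alpha ", "beta", " gamma"])
            # Apply a transform to each element of the PCollection.
            | "Normalize" >> beam.Map(normalize)
            # Write the final, transformed PCollection.
            | "WriteLines" >> beam.io.WriteToText(output_prefix)
        )
```

Calling run(["--runner=DirectRunner"]) should execute the pipeline locally. Switching to another runner is a matter of changing the options, not the pipeline code.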
Transforms
- Element-wise transforms operate on individual elements within a PCollection.
- Similar to the map phase of MapReduce.
- Transformations are executed by invoking a ParDo operation.
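A ParDo can be sketched as follows. This is an illustrative example, assuming the apache-beam package is installed; split_words, SplitWordsFn and run_pardo_example are hypothetical names. The DoFn's process method is called once per element and may emit zero, one or many outputs, which is what makes it comparable to a map phase.

```python
def split_words(line):
    """Element-wise logic (hypothetical): one input line yields zero or more words."""
    for word in line.split():
        yield word.lower()


def run_pardo_example(lines):
    # Deferred import: only needed when the pipeline is actually built.
    import apache_beam as beam

    class SplitWordsFn(beam.DoFn):
        # process() is invoked once for every element of the input PCollection.
        def process(self, element):
            yield from split_words(element)

    # No options given, so Beam falls back to its default local runner.
    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "CreateLines" >> beam.Create(lines)
            | "SplitWords" >> beam.ParDo(SplitWordsFn())
            | "PrintWords" >> beam.Map(print)
        )
```

For simple one-to-one or one-to-many functions, beam.Map and beam.FlatMap are shorter ways of expressing the same element-wise processing.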
Canceling
- Canceling a job causes a near-immediate halt of execution.
- Suitable for idempotent pipelines, which can safely be re-run.
- If the pipeline consumes data destructively, canceling may result in lost data.