beamcookbook

Getting Started Tutorial

Launch this in Cloud Shell

cloudshell launch-tutorial docs/java/tutorials/getting_started.md

Open Project Folder

Change into the tutorials folder.

cd tutorials/java/

Generate Project Code

Use the Beam Maven archetype to generate a starter project that contains a simple pipeline, ready to modify. The last command changes into the generated project directory.

mvn archetype:generate \
     -DarchetypeGroupId=org.apache.beam \
     -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-starter \
     -DarchetypeVersion=2.12.0 \
     -Dversion="0.1" \
     -DgroupId=com.gcp.cookbook.beam \
     -DartifactId=getting-started \
     -Dpackage=com.gcp.cookbook \
     -DinteractiveMode=false
cd getting-started/

Code Walkthrough

Open StarterPipeline.java.

To begin, let’s look at the pipeline’s main method.

  • line 50

    Initializes the pipeline from the command-line arguments.

      Pipeline p = Pipeline.create(
          PipelineOptionsFactory.fromArgs(args).withValidation().create());
    
  • line 53

    Populates the pipeline with a list of words.

      p.apply(Create.of("Hello", "World"))
    
  • line 54

    Applies a SimpleFunction that upper-cases each word, one word at a time.

      .apply(MapElements.via(new SimpleFunction<String, String>() {
        @Override
        public String apply(String input) {
          return input.toUpperCase();
        }
      }))
    
  • line 60

    Applies a ParDo whose DoFn logs each word.

      .apply(ParDo.of(new DoFn<String, Void>() {
        @ProcessElement
        public void processElement(ProcessContext c)  {
          LOG.info(c.element());
        }
      }));
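
To make the data flow of the steps above easier to follow, here is a rough plain-Java analogy that uses java.util.stream instead of Beam, so it runs without the Beam dependencies. It is only a sketch of what happens to the words, not how Beam executes the pipeline. The class name StarterPipelineSketch and the method upperCaseEach are hypothetical and do not appear in the generated project.

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java analogy of the starter pipeline's data flow. This is NOT Beam code.
// StarterPipelineSketch and upperCaseEach are hypothetical names for illustration.
class StarterPipelineSketch {

  // Mirrors the MapElements step: upper-case each word, one word at a time.
  static List<String> upperCaseEach(List<String> words) {
    return words.stream()
        .map(String::toUpperCase)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Mirrors Create.of("Hello", "World"): start from a fixed list of words.
    List<String> words = List.of("Hello", "World");

    // Mirrors the final ParDo: emit each word (stdout stands in for LOG.info).
    upperCaseEach(words).forEach(System.out::println);
  }
}
```

One difference worth keeping in mind: this sketch processes a list in order, whereas a Beam PCollection does not guarantee element order, so the logged words from the real pipeline may appear in either order.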
    

Run Pipeline

Run Locally

mvn compile exec:java \
    -Dexec.mainClass=com.gcp.cookbook.StarterPipeline \
    -Dexec.args="--runner=DirectRunner"