Deep Dive: Google BigQuery Connector

Document created by rodrigo.delatorre on May 3, 2018. Last modified by Adam Arrowsmith on Jun 5, 2018.

Dell Boomi recently released a Google BigQuery connector that makes it easy to integrate with the Google BigQuery enterprise data warehouse. Let’s see what it takes to get up and running with the Google BigQuery integration.

Overview

The new Google BigQuery connector allows you to manipulate tables and views in BigQuery. With the supported operations, you can insert new data into an existing table, update table and view schemas and characteristics, and query those same tables. Note that, per the Google BigQuery API, the connector can only insert new rows into tables; existing rows cannot be updated or deleted.

 Get the example process used below from the Process Library here.

Configure the Google BigQuery connection

The BigQuery connection uses OAuth 2.0 for authentication and authorization.

Notes

  • On the Connection tab, the Project ID within Google BigQuery is required. This ID uniquely identifies your project, and the project is billed for any jobs you create.
  • On the OAuth 2.0 tab, you must enter the Client ID and Client Secret. You can retrieve these values from the Google API Console and then generate the access token. The default scope grants the most common permissions, but you can change it according to your needs; see the note below.

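For reference, the default scope typically corresponds to Google's standard BigQuery OAuth scope shown below; confirm the exact value on your connection's OAuth 2.0 tab.

    https://www.googleapis.com/auth/bigquery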

Operations

Streaming Insert

Use the Streaming Insert operation to insert new rows of data into a Google BigQuery table. The input is JSON, and only tables (not views) can be targeted by this operation. Per the Google BigQuery API, you can only append new rows of data to tables.

The following image shows a simple process that adds a row in a table using the Streaming Insert operation:

Steps

1. Using the Message shape, define a JSON document containing the new rows to be added to the table, for example:

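A minimal sketch of such a multi-row JSON, loosely based on the underlying tabledata.insertAll API (the column names first_name and role are hypothetical; the exact structure is defined by the operation's imported request profile):

    {
      "rows": [
        { "json": { "first_name": "Alice", "role": "engineer" } },
        { "json": { "first_name": "Bob", "role": "analyst" } }
      ]
    }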

2. Using a Map shape, map the JSON created in Step 1 to the input profile of the Streaming Insert operation:

3. When performing the operation import, the Dataset ID must be provided:

As a result, the available tables in that dataset are displayed for selection.

After selecting the table, the operation page appears as follows. There are several options you can configure within the operation; this article focuses on the basic scenario.

Note: After the operation executes, you won't immediately see the changes in BigQuery. Streamed data can take some time to appear, and a warning popup appears in the meantime.

Run Job and Get Query Results

Jobs are objects that manage asynchronous tasks such as running queries, loading data, and exporting data. You can run multiple jobs concurrently in BigQuery, and completed jobs are listed in the Jobs collection. After the Run Job operation executes, it returns a JSON response containing the Job ID, which is then used by the Get Query Results operation.

The following image shows the steps to execute the Run Job operation, obtain the Job ID, and use it in the Get Query Results operation.

Steps

1. Provide a JSON as input for the Run Job operation, as shown in the following example. Here, a basic SQL statement is used, but you can use any variant the API supports.

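A minimal sketch of a query job input, loosely based on BigQuery's query job configuration (the dataset and table names are hypothetical, and the exact field names depend on the request profile imported from the connector):

    {
      "query": "SELECT first_name, role FROM mydataset.employees WHERE role = 'engineer'",
      "useLegacySql": false
    }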

2. On import, the Run Job operation has only one associated object type, Query. Select it to load the profiles.

3. Use a Map shape to map the response profile obtained from the Run Job operation to the request profile of the Get Query Results operation. This is just one of many ways to pass the Job ID along.

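In essence, the mapping only needs to carry the Job ID across. As a rough sketch, the mapped Get Query Results request might contain little more than the following (the jobId field name is an assumption based on the underlying jobs.getQueryResults API; rely on the imported request profile for the real structure):

    {
      "jobId": "job_abc123"
    }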

4. As with the Run Job operation, the only available object type to import is Query. This must be done to load the request profile.

After the operation executes, the response contains one document per row returned by the SQL statement from Step 1. Note that the operation waits until the job is done in BigQuery, and use of the Timeout and Max Results properties can extend the operation's execution time.

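As a loose illustration, a single result document may resemble the row encoding used by the underlying jobs.getQueryResults API, in which each row is a list of field/value pairs (the actual response profile generated by the connector may differ):

    {
      "f": [
        { "v": "Alice" },
        { "v": "engineer" }
      ]
    }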

Update

Use the Update operation to update a table's or view's schema, labels, description, and expiration time.

Steps

1. Provide a JSON as input for the Update operation, as shown in the following example. Note that no reference to the table or view being updated is included, because that information is obtained from the operation configuration.

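A minimal sketch of an update payload, based on fields from BigQuery's table resource (the description and label values are hypothetical; the imported request profile defines the exact structure):

    {
      "description": "Employee master data, updated from Boomi",
      "labels": {
        "environment": "test"
      }
    }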

2. At the operation level, during the import, select the table to be updated. In this case, the JSON provided in Step 1 works as-is. If you select “DYNAMIC TABLE”, the table reference must be included in the JSON, as sketched below.

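For the DYNAMIC TABLE case, a hypothetical payload would add a tableReference block, mirroring BigQuery's table resource (the project, dataset, and table names here are placeholders):

    {
      "tableReference": {
        "projectId": "my-project",
        "datasetId": "mydataset",
        "tableId": "employees"
      },
      "description": "Employee master data, updated from Boomi"
    }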

3. Use a Decision shape to check that the returned status code is the expected one (200).

Execute the operation and verify in BigQuery that the description is updated. You can use the same steps or JSON to update a view.
