Questions tagged [google-bigquery]

1

votes
1

answer
167

Views

BigQuery regex_replace error (\? vs \\?)

I am having issues understanding what's wrong with this regex: \?.* select REGEXP_REPLACE(longstringcolumn, '\?.*', '') as newstring from tablename My example string aka 'longstring' has '?' character, and I am trying to match everything trailing '?' (including '?' itself). I have checked my regexp...
geekidharsh
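The pattern in this question can be checked outside BigQuery. The sketch below is plain Python, not BigQuery SQL. It assumes, as the title hints, that the trouble is how the backslash is written in the string literal rather than the regex itself. The function name `strip_query` and the sample URL are hypothetical.

```python
import re


def strip_query(text: str) -> str:
    """Remove the first '?' and everything that follows it."""
    # A raw string keeps the backslash, so the regex engine sees \? (a literal '?').
    return re.sub(r"\?.*", "", text)


# Hypothetical sample value; the original column contents are not shown.
sample = "https://example.com/page?utm_source=x&id=7"
print(strip_query(sample))

# In a non-raw Python literal the backslash must be doubled to get the same pattern.
print(r"\?.*" == "\\?.*")
```

The two spellings produce the same pattern text, which is the distinction the title draws between `\?` and `\\?`.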
1

votes
2

answer
142

Views

How to query a Google BigQuery table and remove duplicates based on a subset of columns?

I have a query that joins two Google tables and produces a table with 6 columns (a, b, c, d, e, f). Next, I move that table to a Google bucket and then download that bucket to a bunch of CSVs. Finally I insert those CSVs into a Postgres database table which has 2 primary keys, a and b. The...
user1367204
1

votes
2

answer
1.4k

Views

BigQuery: Syntax error: Unexpected keyword LEFT

I got the error 'Syntax error: Unexpected keyword LEFT' from the following SQL (standard SQL) in BigQuery: select left(cast(ts as string), 16) from temp.loc limit 1; 'ts' is a timestamp field and I wanted to get the timestamp up to the minute. Any idea?
kee
1

votes
2

answer
107

Views

Bigquery Authorized View Cost Billing Account

When there are two projects under two different billing accounts, and there is an authorized view across the two projects, which billing account will be billed for the query cost on the views? Scenario: Project A contains the views, which use Project B's dataset, which contains the actual data. When analysts r...
Chaoming Li
1

votes
1

answer
450

Views

BigQuery: Questions on Delete and Update rows using nodejs

I've found a lot of node.js examples to query and insert data into BigQuery but didn't find any example nor API description on how to delete and update rows in the database. I am aware of the limitations (30 minutes since last change, etc.). The only tip I've found I got from vscode bigQuery.dataset...
JLCDev
1

votes
2

answer
59

Views

How to pass query statement to bigquery in node.js environment

In BigQuery, I want to pass a function's parameters into a SQL statement as @-prefixed named variables. However, I cannot find a method that supports this in node.js. For Python, there are methods like the following example. You can use the function's parameters...
황인규
1

votes
2

answer
61

Views

Is there a quicker way to initialise a BigQuery client?

Using the recommended way to initialise a BigQuery client from the google documentation at Quickstart: Using Client Libraries takes 15 seconds to complete. This seems very slow - is there a quicker way? import com.google.cloud.bigquery.BigQuery; import com.google.cloud.bigquery.BigQueryOptions; publ...
chris
1

votes
3

answer
59

Views

Wildcard table matches with _TABLE_SUFFIX and sub-query

The _TABLE_SUFFIX feature is great and exactly what I was looking for to solve my problem - however it is scanning all of the data matched by the wildcard when I use a sub-query to determine which tables to match on. If you do an operation such as = or BETWEEN or IN with a set of values on _TABLE_SU...
Alexander Baumann
1

votes
3

answer
66

Views

Changing query to avoid “Aggregations of aggregations are not allowed” in Bigquery

Given user and order tables, I need to count users who made their first order on the next day after registration date. I managed to list such users with the following query: SELECT users.first_name as first_name, users.last_name as last_name, users.registration_date as registration_date, min(orders...
Vadim Tikanov
1

votes
1

answer
50

Views

Calculate distance on a polyline of a road between 2 lat/lons

This is not distance as the crow flies. I'm looking for an API like this: distanceMiles = calculateMilesBetweenPointsAlongRoad(LatLon1, LatLon2, RoadPolyline) I have a road represented as a polyline. As a vehicle moves on this road, I capture lat/lons. I want to calculate the distance the vehicle tr...
Jason
1

votes
1

answer
74

Views

How to change the col type of a BigQuery repeated record

I'm trying to change a col type of a repeated record from STRING to TIMESTAMP. There are a few suggestions from BQ docs here (manually-changing-schemas). However, I'm running into issues with each of the recommended suggestions. Here is an example schema: { 'name' => 'id', 'type' => 'STRING', 'mode'...
harlow
1

votes
1

answer
29

Views

Find all rows with Null value(s) in a specific column(s) in Big Query

Is there a way to improve the following? I need to count all rows with NULL value(s) in a specific column. SELECT SUM(IF(column1 IS NULL, 1, 0)) AS column1, SUM(IF(column2 IS NULL, 1, 0)) AS column2 FROM `dataset.table`;
spicyramen
0

votes
2

answer
24

Views

SQL multiple AS columns from WHERE

I have a table:

name | age | city
-------------
joe | 42 | berlin
ben | 42 | munich
anna | 22 | hamburg
pia | 50 | berlin
georg | 42 | munich
lisa | 42 | berlin

Now I would like to get all 42 year olds in different columns by city:

berlin | munich
-------------
joe | ben
lisa | georg

So I would need so...
user987875
1

votes
2

answer
1.1k

Views

BigQuery DeDuplication on two columns as unique key

We use BigQuery religiously and have two tables that were essentially updated in parallel by different processes. The problem is that we don't have a unique identifier for the tables, and the goal is to combine the two tables with zero duplication if possible. The unique identifier is two columns combined....
Dovy
0

votes
1

answer
143

Views

Possibility of updating data in real-time on a client

I have the following scenario that I was wondering if it's possible/feasible to implement. I apologize if this is considered an overly 'broad' question, but I think SO would be the best place to ask this. Let us suppose I have a website and I want to display a graph to an end-user. For the purposes...
David542
1

votes
2

answer
611

Views

Uploading JSON to Bigquery unspecific error

I am just getting started with the python BigQuery API (https://github.com/GoogleCloudPlatform/google-cloud-python/tree/master/bigquery) after briefly trying out (https://github.com/pydata/pandas-gbq) and realizing that the pandas-gbq does not support RECORD type, i.e. no nested fields. Now I am try...
Fabian Bosler
1

votes
2

answer
47

Views

How can I extract table definition from BigQuery

I want to duplicate a specific table schema without the data. Basically create a clean table with a different name. Say the original table orders is: a integer b string c float I want to create orders-copy as: a integer b string c float BigQuery offers the COPY option from the UI but this also copies the d...
jack
1

votes
2

answer
251

Views

How to get max value of column values in a record ? (BigQuery)

I want to get the max value of each row, not the max value of a field. For example, when I have a sample_table like this:

sample_table
|col1|col2|col3|
|--------------|
| 1  | 0  | 0  |
| 0  | 2  | 0  |
| 2  | 0  | 0  |
| 0  | 0  | 3  |

And the query and result I want is something like this: query SELECT SOM...
Taichi
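The row-wise maximum described in this question can be illustrated in a few lines of Python. This is only a sketch of the intended result using the sample rows from the excerpt, and `row_max` is a hypothetical helper name. In BigQuery standard SQL the `GREATEST` function is the usual row-wise counterpart, though it returns NULL when any argument is NULL.

```python
from typing import List, Sequence


def row_max(rows: Sequence[Sequence[int]]) -> List[int]:
    """Return the largest value within each row (not within each column)."""
    return [max(row) for row in rows]


# Rows copied from the sample_table in the question: (col1, col2, col3).
sample_table = [
    (1, 0, 0),
    (0, 2, 0),
    (2, 0, 0),
    (0, 0, 3),
]

print(row_max(sample_table))
```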
1

votes
1

answer
63

Views

Cosine similarity between pair of arrays in Bigquery

I have created a table that has a pair of IDs and coordinates for each of them so that I can calculate pairwise cosine similarity between them. The table looks like this. The number of dimensions for the coords is currently 128, but it can vary. But the number of dimensions for a pair of IDs is always sa...
Syed Arefinul Haque
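For reference, cosine similarity between two equal-length coordinate arrays is the dot product divided by the product of the two Euclidean norms. The sketch below shows that arithmetic in Python for a single pair. It is not the BigQuery query the question asks for, and `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their norms."""
    if len(a) != len(b):
        raise ValueError("both arrays must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Illustrative 3-dimensional vectors; the question uses 128 dimensions.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The same function works for any dimension count, as long as both arrays in a pair have the same length.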
1

votes
3

answer
162

Views

Get a massive csv file from GCS to BQ

I have a very large CSV file (let's say 1TB) that I need to get from GCS onto BQ. While BQ does have a CSV-loader, the CSV files that I have are pretty non-standard and don't end up loading properly to BQ without formatting it. Normally I would download the csv file onto a server to 'process it' and...
David542
1

votes
2

answer
46

Views

Create an array with NULL values/0 and find array length excluding null/0

I want to find the number of columns in a range in each row which have a non-null and >0 value. I have done this currently using CASE WHEN statements or IF-ELSE. But the number of columns that I now have to consider has increased, and with it the number of CASE statements too. So I wanted to create an...
Raj
1

votes
1

answer
219

Views

List all the tables in a dataset in bigquery using bq CLI and store them to google cloud storage

I have around 108 tables in a dataset. I am trying to extract all those tables using the following bash script: # get list of tables tables=$(bq ls '$project:$dataset' | awk '{print $1}' | tail +3) # extract into storage for table in $tables do bq extract --destination_format 'NEWLINE_DELIMITED_JSON...
Syed Arefinul Haque
1

votes
2

answer
44

Views

Counting the occurrence of a substring from a delimited field

I have some data that looks like: Sequence, length abc, 1 bat, 1 abc > abc, 2 abc > bat, 2 ced > ced > ced > fan, 4 I'm trying to see the frequency of various strings as a new column to this data. For example: Sequence, length, count_of_ced abc, 1, 0 bat, 1, 0 abc > abc, 2, 0 abc > bat, 2, 0 ced > c...
AI52487963
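The counting described in this question amounts to splitting each sequence on the `>` delimiter and counting exact matches. The Python sketch below shows that logic on the sample rows from the excerpt. It is an illustration of the idea rather than a BigQuery answer, and `count_token` is a hypothetical helper name.

```python
def count_token(sequence: str, token: str, sep: str = ">") -> int:
    """Count how many delimited parts of `sequence` equal `token` exactly."""
    return sum(1 for part in sequence.split(sep) if part.strip() == token)


# Sequences copied from the question's sample data.
sequences = [
    "abc",
    "bat",
    "abc > abc",
    "abc > bat",
    "ced > ced > ced > fan",
]

for seq in sequences:
    print(seq, count_token(seq, "ced"))
```

Matching whole parts rather than substrings avoids counting a token that merely appears inside a longer one.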
1

votes
2

answer
57

Views

Bigquery: Is there a way to round a timestamp UP or DOWN to the NEAREST minute?

I've been trying to round UP and DOWN to the NEAREST minute in Bigquery. Does anyone know the best function and method to achieve this? user_id | created_at ------------------------------------- 14451 | 2019-01-31 04:51:28 UTC 14452 | 2019-01-31 04:51:31 UTC 14453 | 2019-01-31...
Livewire
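Rounding to the nearest minute can be described as truncating to the minute and then adding one minute when 30 or more seconds were dropped. The Python sketch below shows that rule on two timestamps from the excerpt. Treating exactly 30 seconds as rounding up is an assumption, and `round_to_minute` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def round_to_minute(ts: datetime) -> datetime:
    """Round a timestamp to the nearest minute (30 seconds or more rounds up)."""
    floored = ts.replace(second=0, microsecond=0)
    if ts - floored >= timedelta(seconds=30):
        return floored + timedelta(minutes=1)
    return floored


# Two created_at values copied from the question.
early = datetime(2019, 1, 31, 4, 51, 28, tzinfo=timezone.utc)
late = datetime(2019, 1, 31, 4, 51, 31, tzinfo=timezone.utc)

print(round_to_minute(early))
print(round_to_minute(late))
```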
1

votes
3

answer
45

Views

How can you figure out if Column A contains something from Column B?

I've been trying to figure out a way to grab information from Table A Column A compared to Table B Column A, for example: TableA Name abcd_1234_efgh zxcdde_gets_3214_ jkil_uelso_5555_aseil uuuu_kkkk_iiii_3333 TableB ID 1234 3214 5555 3333 I've tried doing an INNER JOIN...
Maykid
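The comparison in this question is a containment join: a row from TableA pairs with a row from TableB when the ID appears somewhere inside the name. The Python sketch below shows that logic with the sample values from the excerpt. It is not the SQL join itself, and `match_ids` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def match_ids(names: Sequence[str], ids: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each name with every ID that occurs as a substring of it."""
    return [(name, id_) for name in names for id_ in ids if id_ in name]


# Values copied from TableA.Name and TableB.ID in the question.
table_a = [
    "abcd_1234_efgh",
    "zxcdde_gets_3214_",
    "jkil_uelso_5555_aseil",
    "uuuu_kkkk_iiii_3333",
]
table_b = ["1234", "3214", "5555", "3333"]

for name, id_ in match_ids(table_a, table_b):
    print(name, id_)
```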
1

votes
2

answer
75

Views

Get number of rows in a BigQuery table (streaming buffer)

I am doing inserts via Streaming. In the UI, I can see the following row counts: Is there a way to get that via the API? Currently when I do: from google.cloud import bigquery client = bigquery.Client() dataset = client.dataset('bqtesting') table = client.get_table(dataset.table('table_streaming')) ta...
David L
1

votes
2

answer
85

Views

Flatten nested JSON string to different columns in Google BigQuery

I have a column in one of my BigQuery tables which looks like this: {'name': 'name1', 'last_delivered': {'push_id': 'push_id1', 'time': 'time1'}, 'session_id': 'session_id1', 'source': 'SDK', 'properties': {'UserId': 'u1'}} Is there any way to get the output like this in GBQ? (basically flatten the...
Munagala
1

votes
1

answer
38

Views

How to extract separate values from GeoJSON in BigQuery

I have a GeoJSON string for a multipoint geometry. I want to extract each of those points to a table of individual point geometries in BigQuery. I have been able to achieve point geometry for one of the points. I want to do it for all the others as well in an automated fashion. I've already tried conv...
Pranay Nanda
1

votes
4

answer
37

Views

How do you query an array in Standard SQL that meets a certain conditional?

I am trying to pull records whose arrays only meet a certain condition. For example, I want only the results that contain 'IAB3'. Here is what the table looks like Table Name: bids Column Names: BidderBanner / WinCat Entries: 1600402 / null 1911048 / null 1893069 / [IAB3-11, IAB3] 1214894 / IAB3 How...
kaecvtionr
1

votes
3

answer
49

Views

Find maxima and minima of time series values using SQL

I have a certain set of index values that increase and decrease over time. I wish to identify the time periods during which values rise and values fall. The data looks like this: I tried partitioning the values by the range and I definitely don't think I'm doing it right. Here's the query I wrote w...
Pranay Nanda
1

votes
2

answer
40

Views

Using the append model to do partial row updates in BigQuery

Suppose I have the following record in BQ: id name age timestamp 1 'tom' 20 2019-01-01 I then perform two 'updates' on this record by using the streaming API to 'append' additional data -- https://cloud.google.com/bigquery/streaming-data-into-bigquery. This i...
David542
0

votes
1

answer
12

Views

Connecting Spreadsheet to BigQuery

I want to connect a Google Spreadsheet to a new BigQuery table that populates and updates the data automatically. I'm using this tutorial to do the setup. My problem: I had to configure each column manually and the table came up empty, so I have to query it into another table to bring in the data. I'm not exp...
Igor Souza
1

votes
1

answer
91

Views

Loading Avro Data into BigQuery via command-line?

I have created an avro-hive table and loaded data into it from another table using the hive insert-overwrite command. I can see the data in the avro-hive table, but when I try to load this into a BigQuery table, it gives an error. Table schema:- CREATE TABLE `adityadb1.gold_hcth_prfl_datatype_accepten...
Vishwanath Sharma
1

votes
0

answer
286

Views

Google Cloud Dataprep - Scan for multiple input csv and create corresponding bigquery tables

I have several csv files on GCS which share the same schema but with different timestamps for example: data_20180103.csv data_20180104.csv data_20180105.csv I want to run them through dataprep and create Bigquery tables with corresponding names. This job should be run everyday with a scheduler. Righ...
Pepperoni Papaya
1

votes
0

answer
305

Views

Dynamically write to tables in Dataflow

Working on a pipeline in Dataflow. I need to write values to multiple BigQuery tables where the desired table names are values in a PCollection. For example with class Data as: public class Data{ public List tableName; public String id; public String value; } I will have a PCollection and I would like...
shockawave123
1

votes
0

answer
119

Views

Can't run App Engine locally with BigQuery

When I import BigQuery: from google.cloud import bigquery I get the following error: Traceback (most recent call last): File '/Users/manuelgodoy/Projects/Klein/src/application/storage/bigquery_models.py', line 6, in from google.cloud import bigquery File '/Library/Frameworks/Python.framework/Versio...
Manuel Godoy
1

votes
1

answer
108

Views

Options to load data into Big Query from Google Cloud Storage programmatically in Java?

I have been searching for a way to load data into Big Query programmatically from Google Cloud Storage. I have done this manually by taking a backup of my Google Cloud Storage of one Kind and dumping it into the BigQuery table, and was able to retrieve data in Android as well. The only problem I am facing is...
Rahul Verma
1

votes
1

answer
51

Views

How to integrate Google Cloud Platform into my company's iOS app [closed]

My company currently stores real time manufacturing data (< 10 GigaBytes) locally in Microsoft SQL Server. We would like to push this data to the cloud and serve it to US clients in an iOS app, preferably in real-time. I have experience with Firebase and Cloud Functions, but not Google Cloud. What...
JGuo
1

votes
0

answer
310

Views

Is there a tool to efficiently export a BigQuery table to BigTable?

Is there a tool to efficiently export a BigQuery table or query result to a BigTable table? Ideally this would be a single Dataflow program that did a BigQuery query on a table and wrote the results to a BigTable table, with a designated key column, and corresponding column names for all the fields...
Henry Minsky
1

votes
1

answer
345

Views

Updating Repeated Array Struct BigQuery

I have (apparently with 104559 rows affected) successfully updated an element of my session_user repeated field (within the table definition see below) with: UPDATE genderfitnessdev.gfa_talend_dev.gfa_employment_cu set session_user = ARRAY ( SELECT AS STRUCT * REPLACE('399975' as level_0) FROM UNNES...
Jules Boult
