Questions tagged [prestodb]

1

votes
1

answer
465

Views

Unable to query Parquet data with nested fields in PrestoDB

I have data, some of which includes nested columns (arrays of arrays of objects), saved as Parquet in Spark 2.2. Now I'm trying to access this data externally with Presto, and I get the following exception when I try to access any nested column: com.facebook.presto.spi.PrestoException: Error opening...
mixermt
1

votes
1

answer
49

Views

How to purge the Presto logs by configuration?

Dear friends and colleagues, we installed the Presto cluster one month ago, and we are very surprised by the Presto logs. We see that logs are not purged from /home/presto/data/var/log, so the logs take up more disk space each week. For now all logs total around 100M, and in the next month it wil...
Judy
1

votes
2

answer
22

Views

SQL syntax for removing a specific row [time] from a specific group [symbol]

I'm running up against the edge of my SQL Query knowledge and could use a point in the right direction. (I am using Presto, but ideally that shouldn't matter because Presto uses common SQL syntax.) What I would like to do is always exclude the 9:31:00 [QuoteTime] ONLY on the 'VIX' Symbol. If possibl...
Daniel Sims
1

votes
3

answer
388

Views

Format int as date in presto SQL

I have an integer date column 'date_created' storing values like 20180527, 20191205, 20200208, and am wondering what the best way to parse it as a date is, so I could do something like this in a query: select * from table where formatted(date_created) > formatted(date_created) - 90 (to return everyt...
d3wannabe
1

votes
1

answer
67

Views

Is PrestoDB a data store for storing data?

I am new to working with Presto and have some questions about it. Is Presto a data store (database), or is it a query engine? Is there a common query syntax for accessing Hive, SQL, and Cassandra data using connectors, or will it accept each data source's queries based on the connectors? Whe...
Clinton Prakash
1

votes
0

answer
181

Views

Presto not using index with mongodb query

I have set up a Presto (0.191) instance with one coordinator and one worker node and want to do some data analysis with data from several sources such as MySQL and MongoDB. When I run a query on the MongoDB table 'earnings', Presto seems to do a full table scan and not have the connector use the index on...
Normalo
1

votes
1

answer
588

Views

Spark Small ORC Stripes

We use Spark to flatten clickstream data and then write it to S3 in ORC+zlib format. I have tried changing many settings in Spark, but the resulting stripe sizes of the ORC files being created are still very small (
Rajiv
1

votes
2

answer
231

Views

Ideas to encrypt/obfuscate results in a Presto query?

Scenario: I have a Presto table that I will be querying, sending the results to various semi-trusted parties. These semi-trusted parties will analyze the data and return results back to me. Some of the data in this table is 'semi-private': nothing that could cause real harm if it were discovered, b...
Jim Heising
1

votes
0

answer
34

Views

How to automatically create a table on a predefined interval (e.g. month) using current_date()

I would like to automatically create a table in my PostgreSQL database every month using stored data. Every time the query runs, I would like the table name to change dynamically (e.g. table_JAN2018, table_FEB2018, table_MAR2018, etc.) so that a new table is created with the new data of that month while the o...
Elli Chatzistamou
1

votes
0

answer
251

Views

Unable to run Presto LDAPS from SQL workbench

I am unable to execute any query from SQL Workbench/J against Presto on AWS EMR, which is LDAPS (SSL/secure LDAP) enabled. The details are as follows: Connection String: jdbc:presto://hostname:8446/hive?SSL=true username=admin password=**** I can connect to it successfully, but while executing any query (l...
Aditya Tiwari
1

votes
0

answer
198

Views

Is it possible to query an in-memory Arrow table using Presto, or is there some way to use a pandas DataFrame as a data source for the Presto query engine?

Is it possible to query an in-memory Arrow table using Presto, or is there some way to use a pandas DataFrame as a data source for the Presto query engine? I have Parquet files that I want to convert to Arrow and then query through Presto. Is something like this possible?
Jaswinder
1

votes
0

answer
189

Views

Presto server crashing causes master node to restart

I am working on a 3-node Presto cluster and trying to run TPC-H queries on 100GB of data on hive-orc. Whenever I execute a query, it starts running, but after a few seconds it crashes and my master node restarts. I could not figure out the problem from the logs, as there are no ERROR...
Sangeeta
1

votes
0

answer
282

Views

How can I JSON.stringify my SELECT values in Presto SQL?

WITH DATA AS ( SELECT id, foo, bar FROM foobars WHERE id = 99 ) SELECT CONCAT('item_', CAST(id AS VARCHAR)) AS key, ARRAY_AGG( MAP( ARRAY[ 'id', 'foo', 'bar' ], ARRAY[ CAST(id AS VARCHAR), CAST(foo AS VARCHAR), CAST(bar AS VARCHAR) ] ) ) AS value FROM DATA; This formats two columns, key and value....
tester
1

votes
0

answer
75

Views

Compute confidence intervals in presto

I'm looking for a handy query to compute confidence intervals of a single column in presto. Is it possible to do this without a nested query?
user2020282
1

votes
1

answer
195

Views

Extracting domain name from referrer url using regex with Presto DB

I'm trying to extract the domain name from a list of referrer URLs in PrestoDB. Using the url_extract_host function, I have a list like the one below. I'm stuck trying to get the domain name out of the string. Presto uses Java-style pattern syntax. I have a list of strings below, all of which should r...
Alexander
1

votes
1

answer
105

Views

Exception while building Presto

I am getting the following exception while building Presto: [ERROR] Failed to execute goal io.airlift.maven.plugins:sphinx-maven-plugin:2.0:generate (default) on project presto-docs: Failed to run the report: Sphinx report generation failed -> [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException...
1

votes
1

answer
89

Views

How to see the memory used by each sub-query in the Presto UI

On the presto-ip:port/ui/ site, when I go to Query Overview -> Live Plan, I see that Stage 6 outputs 371MB of data (which is a lot for me). How can I find out which part of the query it belongs to? Clicking on Stage 6 does not show any useful message.
Archon
1

votes
2

answer
772

Views

Convert two columns to key-value pair JSON string in Presto

Given this:
XXX  YYY  ZZZ
---  ---  ---
AAA  PPP  LLL
AAA  QQQ  MMM
AAA  RRR  NNN
How do I convert it to this?
XXX  JSON
---  ----
AAA  {'PPP': 'LLL', 'QQQ': 'MMM', 'RRR': 'NNN'}
FYI, I do not have access to row_to_json function in the database. Attempts have included: concatenating them as string (pretty hard...
Jerome Montino
1

votes
1

answer
116

Views

AWS Athena - CAST AS JSON doesn't return a JSON object

I have a list of JSON objects (the result attribute), as in this example: select result from mytable limit 1. I get: [{hop=1, error=null, result=[{x=null, from=192.168.0.1, rtt=0.378, ttl=64, err=null, ittl=null, edst=null, late=null, mtu=null, size=68, flags=null, dstoptsize=null, hbhoptsize=null, icm...
samara
1

votes
1

answer
32

Views

SQL Cross join to use max value for calculation in a case statement without group by

I have the following table:
|BoreholeID|Mins|
-----------------
|BH1       |0.5 |
|BH1       |1   |
|BH1       |1.5 |
and I want to select a third column called timeline that has a case statement that returns either a 1 if the mins value is greater than 80% of the max mins value AND if the mins valu...
DBA108642
1

votes
1

answer
57

Views

What is the strftime config for an Amazon Athena timestamp?

In Python 3, I'd do something like this: '{0:Y-M-d H:m:?.???}'.format(datetime.datetime.now()). However, having searched a bit, I think it would be nice to have a canonical answer somewhere.
Nathan Feger
1

votes
1

answer
314

Views

SQL query for creating a map of arrays in AWS Athena (Presto)

I have a table in AWS Athena with the following columns:
Company name | Employee Name | Salary
------------------------------------
Apple        | John          | 50
Apple        | Dima          | 100
Microsoft    | Bart          | 75
Google       | Harry         | 90
Google       | Noah          | 80
and I wan...
user1358729
1

votes
1

answer
349

Views

Unrecognized VM option 'ExitOnOutOfMemoryError' when starting Presto

I was following the installation manual for Presto, but when I launched the Presto server from the command line I got this error: $ bin/launcher run Unrecognized VM option 'ExitOnOutOfMemoryError' Did you mean 'OnOutOfMemoryError='? Error: Could not create the Java Virtual Machine. Error: A fatal exc...
Dave
1

votes
1

answer
50

Views

Creating a dictionary of categoricals in SQL and aggregating them in Python

I have a rather 'cross-platform' question. I hope it is not too general. One of my tables, say customers, consists of my customer IDs and their associated demographic information. Another table, say transaction, contains all purchases from the customers in the respective shops. I am interested in...
Nicolai Iversen
1

votes
0

answer
162

Views

PrestoDB - connecting to Oracle

The project I'm working on uses PrestoDB with Oracle and MongoDB. I need to bulk-insert about 5000 records into Oracle using the PrestoDB connection to Oracle. To get better performance, I don't want it to commit each row individually, but if I set autocommit to false, it returns the except...
Stefania
1

votes
1

answer
35

Views

Can we configure Presto's database connector information from its GUI?

I am using Presto version 179 and I need to manually create a database.properties file in /etc/presto/catalog through the CLI. Can I do the same thing from the Presto GUI?
UDIT JOSHI
1

votes
1

answer
63

Views

Connect to Presto from Java with a .ppk key and run a simple query

I have been trying to connect to my EMR cluster from Java code to run a Presto query. So far I have created a 'maven project' and added the 'presto dependency' in the 'pom.xml'. I have been referring to this link for the program: https://gist.github.com/nagataka/2c2d9fa49b03e8556faf85345b43f59c I have two q...
Katukuri vamshi krishna
1

votes
0

answer
99

Views

PrestoDB: worker nodes disconnect continuously (No worker nodes available)

I am trying to set up a test PrestoDB cluster on 3 nodes (1 coordinator + 2 worker nodes) on Ubuntu 18.04 machines with 4GB RAM and 80GB HDD. The coordinator properties are as follows: node.properties: node.environment=test node.id=2259f48c-bd6a-11e8-bbdd-1a4f1f5bd394 node.data-dir=/opt/prestodata jv...
Shubham A.
1

votes
0

answer
464

Views

Explode an Array in Athena

I have a simple table in Athena that has an array of events. I want to write a simple select statement so that each event in the array becomes a row. I tried explode and transform, but no luck. I have done it successfully in Spark and Hive, but Athena is tricking me. Please advise. DROP TABLE bi_data_la...
Manjesh
1

votes
0

answer
171

Views

How to set up an AWS Athena query with multiple regex replacements?

I have been trying to write an AWS Athena query and have done enough work to get my data. But I need to identify some patterns in my data and change them in a uniform way in order to group the 'similar' values. So I'm trying to do a regex replacement, but how can I do multiple replacements on the same column in...
Daniel Vega
1

votes
0

answer
130

Views

Presto SQL Count Occurrences In Array Column And Add Missing Timestamps

My Presto SQL statement aggregates multiple rows that all have a datetime into a single row with an array of those datetimes (all the other properties are the same among those rows). So I end up with something like this (there are an arbitrary number of columns, this is simplified): id | timestamps...
mhaken
1

votes
1

answer
833

Views

String to YYYY-MM-DD date format in Athena

So I've looked through the documentation and previous answers on here, but can't seem to figure this out. I have a STRING that represents a date. A typical value looks like this: 2018-09-19 17:47:12. If I do this, it is returned in the format 2018-09-19 17:47:12.000: SELECT date_parse(click_time,'%Y-%...
gooponyagrinch
1

votes
1

answer
147

Views

Casting String type to Unix date in Amazon Athena

I'm looking to get a result in Amazon Athena where I can count the number of users created by day (or maybe by month). But before that I have to convert the Unix timestamp to another date format, and this is where I fail. My end goal is to convert this type of timestamp: 1531888605109 into somethi...
jqc
1

votes
0

answer
146

Views

cast array<struct<key:string,value:array<string>>> into map<string,array<string>>

I have a table like name string one_key_value array kv.value)) AS one_key_value, MAP(TRANSFORM(two_key_value, kv -> kv.key), TRANSFORM(two_key_value, kv -> kv.value)) AS two_key_value FROM table_a; In hive I use SELECT name, map(k1,v1) AS one_key...
John Constantine
1

votes
1

answer
468

Views

Splitting an array into columns in Athena/Presto

I feel this should be simple, but I've struggled to find the right terminology, so please bear with me. I have two columns, timestamp and voltages, where voltages is an array. If I do a simple SELECT timestamp, voltages FROM table, then I get a result of: |timestamp | voltages | |1544435470 |3....
Daniel Crowley
1

votes
0

answer
116

Views

username & password with presto-python-client

I am trying to replace jaydebeapi with the presto-python-client by Facebook. The question is how to replace the authentication bit: db = jaydebeapi.connect(connection['jclass'], connection['host'],[ connection['user'], connection['pass']], connection['jar']) while with presto-python-client: import pre...
Hussein Negm
1

votes
1

answer
106

Views

Presto SQL window aggregate looking back x hours/minutes/seconds

I want to compute an aggregate in Presto SQL by looking back x hours/minutes/seconds. Data:
id | timestamp           | status
-------------------------------------------
A  | 2018-01-01 03:00:00 | GOOD
A  | 2018-01-01 04:00:00 | BAD
A  | 2018-01-01 05:00:00 | GOOD
A...
addicted
1

votes
0

answer
35

Views

How to extend a trendline to the x-intercept

I am working on a chart showing a burndown of task completion, and I have a trendline plotted against my data that's working nicely. Here is the trendline SQL query, done in Presto: SELECT ds, ds * REGR_SLOPE(COUNT(task_id), ds) OVER () ) + ( REGR_INTERCEPT(COUNT(task_id) AS DOUBLE, ds) OVER () ) A...
kathode
1

votes
0

answer
65

Views

Presto “Failed to list directory” when connecting to hive

In Presto, I tried to connect to Hive by configuring the Hive connector as described in https://prestodb.github.io/docs/current/connector/hive.html#configuration It works fine when I use the SHOW TABLES command, but when I tried to fetch data from the table using the query SELECT * FROM sample_table, I got...
Clinton Prakash
1

votes
0

answer
50

Views

Presto: how to manage stop/start/status actions for Presto servers

We installed the following Presto cluster on Red Hat Linux 7.2: Presto latest version 0.216, 1 Presto coordinator, 231 Presto workers. On each worker machine we can use the following command to verify the status: /app/presto/presto-server-0.216/bin/launcher status Running as 61824 and al...
Judy
