Questions tagged [druid]

1

votes
2

answer
280

Views

Druid with Kafka Ingestion: filtering data

Is it possible to filter data by dimension value during ingestion from Kafka to Druid? For example, consider a dimension, version, which might have the values v1, v2, v3; I would like to have only v2 loaded. I realize it can be done using Spark/Flink/Kafka Streams, but maybe there is an out-of-the-box solutio...
pcejrowski
0

votes
0

answer
3

Views

Writing Custom Extensions in Druid

I am new to Druid. Problem statement: We currently push raw event data to Druid. I have a requirement to apply certain calculations to the data (such as certain statistical techniques) which are not supported by Druid or the extensions it provides out of the box. I have two questions: What w...
Uno
1

votes
1

answer
810

Views

Java client with Apache HttpClient to connect to Druid

I am working on ingesting and querying data on a Druid server. However, when I query, I just use the command line as below: curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/ingest_statistic_hourly_generate.json localhost:8090/druid/indexer/v1/task Can anyone tell me the way of utilizing Jav...
VanThaoNguyen
1

votes
3

answer
613

Views

Registered lookup not working in Druid

I've been working with Druid for a short time now and I'm testing the registered lookup functionality. I've already created the lookup under http://:/druid/coordinator/v1/lookups, as follows: { '__default': { 'home_post_code': { 'type': 'map', 'map': {'13210': 'Syracuse, NY'} } } } As far as I under...
Noel Ferreira
1

votes
2

answer
759

Views

Druid - Order data by timestamp column

I've set up a Druid cluster to ingest real-time data from Kafka. Question Does Druid support fetching data that's sorted by timestamp? For example, let's say I need to retrieve the latest 10 entries from a Datasource X. Can I do this by using a LimitSpec (in the Query JSON) that includes the timesta...
jithinpt
1

votes
1

answer
56

Views

Added io.druid dependency breaks Glassfish deployment

My problem looks similar to this one, but I already use Glassfish 4.1.13. I am trying to add druid-client to my Glassfish project. I added druid-client as a separate module. The pom.xml of druid-client includes the following Druid dependency: io.druid druid-server 0.9.1.1 Version of Glassfish: 4.1.13 Also I use maven...
Bo.
3

votes
4

answer
4.8k

Views

How to insert data into druid via tranquility

By following the tutorial at http://druid.io/docs/latest/tutorials/tutorial-loading-streaming-data.html , I was able to insert data into Druid via the Kafka console. The spec file looks as follows: examples/indexing/wikipedia.spec [ { 'dataSchema' : { 'dataSource' : 'wikipedia', 'parser' : { '...
Cheok Yan Cheng
3

votes
2

answer
2.5k

Views

druid vs Elasticsearch

I'm new to druid. I've already read 'druid VS Elasticsearch', but I still don't know what druid is good at. Below is my problem: I have a solr cluster with 70 nodes. I have a very big table in solr which has 1 billion rows, and each row has 100 fields. The user will use different combinations range...
zhouxiang
2

votes
1

answer
1.5k

Views

How to apply hyperloglog to a timeseries stream

Can someone explain or link to an explanation about how counting the cardinality of a set with HLL can be used for time series analysis? I'm pretty sure druid.io does exactly this, but I'm looking for a general explanation of how to do this with HLL alone, without any specific library / database or...
Emmanuel Oga
6

votes
0

answer
114

Views

Apache Druid sql query conversion to json based query

I am trying to convert the following Druid SQL query to a Druid JSON query, as one of the columns I have is a multi-value dimension for which Druid does not support a SQL-style query. My SQL query: SELECT date_dt, source, type_labels, COUNT(DISTINCT unique_p_hll) FROM 'test' WHERE type_labels = 'z'...
Pratik Khadloya
6

votes
1

answer
522

Views

Intersect two queries with different filters

I use Druid for monitoring events in my website. The data can be represented as follows: event_id | country | user_id | event_type ================================================ 1 | USA | id1 | visit 2 | USA | id2 | visit 1 | Canada...
orenMos
3

votes
1

answer
7.4k

Views

Java - java.lang.NoClassDefFoundError: com/google/inject/internal/util/$Preconditions

I'm working on an extension for druid that uses jclouds for Rackspace Cloud Files and I encountered a problem with Google guice and I'm not very confident with Java. I already saw this question, but it doesn't seem that there's a conflict in guice versions. This is the code that is being executed: @...
se7entyse7en
1

votes
1

answer
551

Views

How to format the TSV file in Druid

I am trying to load a TSV file into Druid using this ingestion spec: MOST UPDATED SPEC BELOW: { 'type' : 'index'...
CapturedTree
2

votes
2

answer
1.1k

Views

configure Druid to connect to Zookeeper on port 5181

I'm running a MapR cluster and want to do some timeseries analysis with Druid. MapR uses a non-standard port for Zookeeper (port 5181 instead of the conventional port 2181). When I start the Druid coordinator service, it attempts to connect on the conventional Zookeeper port and fails: 2015-03-03T17...
Alex Woolford
2

votes
1

answer
90

Views

Data structure to store HashMap in Druid

I am a newbie to Druid. My problem is how to store and query a HashMap in Druid, using Java to interact with it. I have a network table as follows: Network f1 f1 f3 .... fn value 1 3 2 ..... 2 Additionally, I have a range-time table: time impression 2016-08-10-00 1000 2016...
VanThaoNguyen
3

votes
1

answer
566

Views

List of supported data types for dimensions in Druid?

I cannot seem to find any particular tutorial/doc page on the Druid website which has a list of all supported data types in Druid for the dimensions. From how much I've read, I know that long, float and string are definitely supported, but I have next to zero information about the other supported ty...
Tarun Verma
2

votes
1

answer
627

Views

Druid: how to cache all historical node data in memory

I have about 10GB of data stored on a historical node. However, the memory consumption for that node is about 2GB. When I launch a select query, the results take more than 30 seconds to return the first time. After that, they return in a second (because of the broker's cache). My concern is to reduce the first time se...
DrWho3
2

votes
2

answer
662

Views

Spark + Druid Tranquility - library version conflict

I get the following error when I run a Spark job with Druid Tranquility: java.lang.NoSuchFieldError: WRITE_DURATIONS_AS_TIMESTAMPS Druid Tranquility uses a higher version of jackson-databind (2.6.1) than what is bundled in Spark. I'm using the latest stable versions of Druid Tranquility (0.6.4) and Spark(1.5....
Ashish
2

votes
4

answer
2.1k

Views

How realtime data input to Druid?

I have an analytics server (for example, a click counter). I want to send data to Druid using some API. How should I do that? Can I use it as a replacement for Google Analytics?
Aryan
2

votes
2

answer
1k

Views

Query druid from java application

I'm new to druid. I want to query a remote druid cluster from my java application. I read in the druid-user google group that we can use io.druid.client.DirectDruidClient . Can someone please help me or point out a resource with an example for the same?
Priyanka
2

votes
1

answer
959

Views

How to perform a SELECT in the results returned from a GROUP BY Druid?

I am having a hard time converting this simple SQL Query below into Druid: SELECT country, city, Count(*) FROM people_data WHERE name='Mary' GROUP BY country, city; So I came up with this query so far: { 'queryType': 'groupBy', 'dataSource' : 'people_data', 'granularity': 'all', 'metric' : 'num_o...
CapturedTree
4

votes
1

answer
702

Views

Add a druid cluster as a SQL database in Apache Superset

I currently connect to the druid cluster through the druid connector in Apache Superset. I heard that SQL can be used to query druid. Is it possible to point my SQL database connection to druid?
5

votes
0

answer
521

Views

storm integration with druid class com.fasterxml.jackson.module.scala.ser.ScalaIteratorSerializer overrides final method withResolved error

I am new to both Storm and Druid. For the last few days, I have been stuck on this issue. I am sending data from Kafka to Storm and then to Druid. I think druidbeambolt is receiving the data but is unable to convert it to JSON before transferring it to Druid. Check my druidboltfactory code for more de...
gashu
2

votes
1

answer
674

Views

Druid for non time-series data

For cases where the data gets sent to Druid immediately as it's generated, all is fine and dandy (as in IoT). Love it. But now I have a different situation, stemming from late data entry. The end user can go offline (losing internet connection), and the data gets stored on her mobile phone, and only get...
Cokorda Raka
2

votes
2

answer
1k

Views

druid kafka indexing service setup

I followed the docs and edited: druid-0.9.2/conf/druid/_common/common.runtime.properties and added: 'druid-kafka-indexing-service' to the druid.extensions.loadList and restarted all druid services: middlemanager, overlord, coordinator, broker, historical I ran: curl -X 'POST' -H 'Content-Type:applic...
KillerSnail
2

votes
1

answer
233

Views

segmentGranularity in Druid indexing task; exact meaning & implication during indexing

I still don't quite get this 'segmentGranularity' in Druid. This page is quite ambiguous: http://druid.io/docs/latest/design/segments.html . It goes on mentioning segmentGranularity but it talks more about intervals (in the first paragraph). Anyway, at this point the volume of my data is not that bi...
Cokorda Raka
2

votes
1

answer
237

Views

Druid:how to add a numeric data to metric without aggregation function

The scenario is that I want to set up a stock quote server and save the quote data into Druid. My requirement is to get the latest price of all the stocks with a query. But I notice that the query interfaces of Druid, such as timeseries, only work on metric fields, not the dimension fields. So I consider to ma...
Yuezhi Liu
2

votes
0

answer
240

Views

Tranquility server would not send data to druid

I'm using imply-2.2.3. Here is my tranquility server configuration: { 'dataSources' : [ { 'spec' : { 'dataSchema' : { 'dataSource' : 'tutorial-tranquility-server', 'parser' : { 'type' : 'string', 'parseSpec' : { 'timestampSpec' : { 'column' : 'timestamp', 'format' : 'auto' }, 'dimensionsSpec' : { 'd...
Haonan Chen
2

votes
1

answer
165

Views

How do I increase number of workers in druid while using it through imply?

I'm running Druid through Imply's setup and I want to increase the number of Druid workers, but I don't know exactly where I should change the Imply configuration to do so. Can anybody please help me with this?
Point Networks
8

votes
2

answer
2.4k

Views

Can druid replace hadoop?

Druid is used for both real time and batch processing. But can it totally replace hadoop? If not why? As in what is the advantage of hadoop over druid? I have read that druid is used along with hadoop. So can the use of Hadoop be avoided?
Amit Sharma
3

votes
1

answer
1.6k

Views

What are differences between Druid and ElasticSearch ? What are advantages for both?

I am pretty new to Druid and I can't find answers regarding the comparison with ElasticSearch. I found this link: druid vs Elasticsearch, but it does not give the differences and advantages. Can anyone explain that to me or give me some links that I didn't find on Google? Thanks in advance. J
3

votes
1

answer
504

Views

Apache druid No known server

I am trying to set up Apache Druid on a single machine following the quickstart guide here. When I start the historical server, it shows an io.druid.java.util.common.IOE: No known server exception on screen. Command: java `cat conf-quickstart/druid/historical/jvm.config | xargs` \ -cp 'conf-quickstart/druid/...
Rahul Sharma
2

votes
4

answer
1.6k

Views

Druid aggregate functions

I am using druid to create a UI for generating reports. For the scripting, I am using the following codes: { 'type' : 'doubleSum', 'name' : 'impressions', 'fieldName' : 'impressions' }, { 'type' : 'doubleSum', 'name' : 'clicks', 'fieldName' : 'clicks' }, { 'type' : 'doubleSum', 'name' : 'pvconversio...
Bitanshu Das
2

votes
1

answer
312

Views

java.nio.channels.UnresolvedAddressException when tranquility index data to druid

I am trying Tranquility with Druid 0.11 and Kafka. When Tranquility receives new data, it throws the following exception: 2018-01-12 18:27:34,010 [Curator-ServiceCache-0] INFO c.m.c.s.net.finagle.DiscoResolver - Updating instances for service[firehose:druid:overlord:flow-018-0000-0000] to Set(ServiceI...
Sergio Jimenez
2

votes
3

answer
894

Views

Tranquility not sending data to Druid

I am evaluating Druid for my use case, which ingests CSV data through Tranquility in real time. The following is the server configuration: { 'dataSources' : { 'audience' : { 'spec' : { 'dataSchema' : { 'dataSource' : 'audience', 'parser' : { 'type' : 'string', 'parseSpec':{ 'format' : 'csv', 'timestampSp...
Mangat Rai Modi
2

votes
0

answer
172

Views

Subqueries in Druid - Cannot build plan for query

I'm trying to write a SQL query in Druid/Superset, which performs a 'GROUP BY' for two different time intervals (let's say last 7 days and last 30 days) on data from the same table and join both results. For example, the result of the first GROUP BY contains the columns 'col1, col2, col3, freq1' (fr...
biomartin
2

votes
1

answer
40

Views

Using nginx to redirect dynamic request

I have a Druid service which runs on my local machine on port 8082 as follows: Method POST: http://localhost:8082/druid/v2/?pretty Body: { 'queryType' : 'topN', 'dataSource' : 'some_source', 'intervals' : ['2015-09-12/2015-09-13'], 'granularity' : 'all', 'dimension' : 'page', 'metric' : 'edits', 'th...
Juvenik
2

votes
0

answer
122

Views

Getting java.lang.NoSuchFieldError: INCLUDE_ALL when adding Druidry as a dependency to my Gradle project?

So I have been working on a dropwizard application. I had the controller working as I had wanted and now wanted to start implementing queries using Druid. I had wanted to use Druidry as a way to make queries through Java. My application works as desired without adding this dependency. However, simpl...
Naomi
4

votes
1

answer
252

Views

Hadoop and Druid incompatibility issue with Jackson library

I am running Druid 0.9.0 on an Azure cluster with HDP Insight 2.4.1.1-3. The Hadoop client is 2.7.1. After countless attempts to solve the issue with Jackson, specifically: Error: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize. I've tri...
Stelios Savva
3

votes
1

answer
1.5k

Views

How to add Post Aggregation value fields as Metric in Druid io

I am using druid io 0.9.0. I am trying to add a post-aggregation field as a metric spec. My intention is to show the value of the post-aggregation field the same way metrics (measures) are shown (in druid io using Pivot). My druid io schema file is { 'dataSources' : { 'NPS1112' : { 'spec' : { '...
