Questions tagged [yarn]

0 votes · 0 answers · 2 views

Rails 6 + yarn + datatables issue

I bumped into trouble with webpacker. I'm using Rails 6.beta3 and trying to add Datatables to my app. My steps: yarn add datatables.net-dt, then in app/javascript/packs/application.js: require('@rails/ujs').start() require('turbolinks').start() require('@rails/activestorage').start() require('channe...
snake
1 vote · 0 answers · 1.4k views

GPU resource for hadoop 3.0 / yarn

I'm trying to use the Hadoop 3.0 GA release with GPU support, but when I execute the shell command below there is an error and it does not work with the GPU. I guess there is a misconfiguration on my side; please check the command and output below. 2018-01-09 15:04:49,256 INFO [main] distributedshe...
Kangrok Lee
1 vote · 1 answer · 213 views

Hive on Tez in EMR schedules tasks very slowly

I'm trying to use Hive on Tez to query ORC-format data stored in S3. The Tez AM scheduled tasks very slowly, and a lot of map tasks remained in 'PENDING' for a long time. There were enough resources in the cluster (quite enough, I would say: more than 6TB of memory and more than a thousand vcores avail...
Harper
1 vote · 0 answers · 159 views

Spark cluster managed by Yarn throws java.lang.ClassNotFoundException

I am new to YARN but have some experience with a Spark standalone master. I recently installed a YARN+Spark cluster using Ambari. I have a Spark program (program.jar) compiled to a jar which relies on another jar to work (infra.jar). I set the following configurations in Ambari: spark.executor.ext...
Anton.P
1 vote · 0 answers · 179 views

Spark streaming from Kafka returns result on local but not on YARN

I am using Cloudera's VM CDH 5.12, Spark v1.6, Kafka (installed by yum) v0.10 and Python 2.66. I am following this link for the Spark settings. Below is a simple Spark application that I am running: it takes events from Kafka and prints them after a map-reduce. from __future__ import print_function import sys fr...
Samhash
1 vote · 0 answers · 205 views

Yarn / IIS URL rewrite

I have a React app deployed to production as a bundle.js and index.html pair, served up by IIS under a URL like https://my-app.com. This talks to a .Net WebAPI backend, served up by IIS as a virtual application at https://my-app.com/api. This allows the React app to make requests to the relative url...
Alex McMillan
1 vote · 1 answer · 89 views

Apache Spark circuit breaker

Using Apache Spark 1.6.2 in a Hadoop YARN cluster. Some (simple) queries can consume a lot of resources; I see our developers running SELECT * FROM DB against a 1 TB file! Hence it takes a long time and 'blocks' all YARN resources for a while (and crashes most of the time after a few hours...). I am wond...
Thomas Decaux
1 vote · 0 answers · 709 views

flink on yarn - Could not initialize class org.apache.hadoop.fs.s3a.S3AFileSystem

I'm trying to run a Flink job on yarn: $ export HADOOP_CONF_DIR=/etc/hadoop/conf $ ./${FLINK_HOME}/bin/yarn-session.sh ... $ ./${FLINK_HOME}/bin/flink run \ /home/clsadmin/messagehub-to-s3-1.0-SNAPSHOT.jar \ --kafka-brokers ${KAFKA_BROKERS} \ ... --output-folder s3://${S3_BUCKET}/${S3_FOLDER} The st...
Chris Snow
1 vote · 1 answer · 10 views

How to create a React app using only Docker, without installing Node.js on the host?

I am creating a new React.js application using Docker and I want to create the new instance without installing Node.js on the host system. I have seen many tutorials, but every time the first step was to install Node.js on the host, init the app and then set up Docker. The problem I ran into was that the official Node.js Docke...
mimros
1 vote · 0 answers · 238 views

How to deploy Spark2 Job via Yarn REST API

Although there are some examples (and questions) on how to submit Spark jobs via the YARN REST API, there are none that address the specific changes required to make it work with Spark 2. I'm currently basing my work on this example and the accompanying documentation, but one thing already is quite cle...
Rick Moritz
1 vote · 0 answers · 30 views

Does a configuration exist for webpack in which file naming is incremental instead of using a hash function?

I generated a web app with create-react-app and after the build the result is something like: build\static\js\main.52990f3a.js build\static\css\main.eeea279c.css. If I make some changes and run the build again I usually get a similar result but with a different hash. Is it possible to change this na...
michoprogrammer
1 vote · 0 answers · 33 views

Why does HBase need NodeManager when it uses coprocessors?

NodeManager is used to start, execute and monitor containers on YARN (containers are assigned to execute map-reduce jobs). A coprocessor, on the other hand, is a framework which does distributed computation directly within the HBase server processes. I have tables in HBase which I query using Phoenix. My...
Aditya
1 vote · 0 answers · 321 views

Configuring an Apache Flink local setup with pseudo-distributed YARN

I have set up Hadoop 2.8.3 and YARN on my Mac machine in pseudo-distributed mode, following this blog: Hadoop in Pseudo Distributed Mode. I have been able to successfully start HDFS and YARN, and also to submit MapReduce jobs. After that I downloaded Apache Flink 1.4 from here. Now I am tryi...
Karrtik Iyer
1 vote · 0 answers · 168 views

How to set yarn.nodemanager.resource.cpu-vcores (number of virtual cores)

I am a little confused: what should the value of yarn.nodemanager.resource.cpu-vcores be? (based on the picture below...) The real total CPU cores on the worker machine? Or 80% of the real total CPU cores on the worker machine (as some sites recommend)? For now I set it to 8 (on each worker...
enodmilvado
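As a note on the vcores question above, the property lives in yarn-site.xml; a minimal fragment is sketched below (the value 8 mirrors the asker's current setting, and whether to advertise all physical cores or only a share of them is a judgment call, not a hard rule):

```xml
<!-- yarn-site.xml sketch: cap the vcores YARN may hand out per NodeManager.
     Leaving headroom below the physical core count is a common rule of thumb
     when the DataNode or other daemons share the machine. -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>
</property>
```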
1 vote · 0 answers · 95 views

spark cluster using yarn and fair scheduler hits maximum cores per user

I am making multiple yarn job submissions into a cluster using the fair scheduler and dynamic allocation. I can load up many jobs in the queue. What I see is that the jobs will run up to 18 worker cores, then it will hold off the jobs downstream, until one of the active cores is finished. Where woul...
bhomass
1 vote · 0 answers · 568 views

What should I do if an executor node suddenly dies in spark-streaming?

I am using version 1.6 of Spark Streaming. A few days ago, my Spark Streaming app (context) suddenly shut down. Looking at the log, one of the executors seems to have been shut down. (The machine was actually turned off.) What should I do in case this happens? (Note that the dynamic allocation option is not...
Dogil
1 vote · 0 answers · 116 views

HiBench wordcount job hangs on hadoop 2.9

I am using: HiBench 7.0, Hadoop 2.9, Java version 1.8.0_161, Scala code runner version 2.11.6, Apache Maven 3.5.2, all on a three-node Hadoop cluster of OpenStack VMs with details: Ubuntu 16.04.3 LTS, VCPUs: 8, RAM: 16GB, size: 10GB. Each has a 100GB volume attached where the dfs storage is kept. Wh...
Diego Delgado
1 vote · 1 answer · 13 views

How to fix packaging dependency on react-gauge-chart

I am trying to track down a packaging issue caused by a package called react-gauge-chart. After installing it, the error 'module not found' was shown. I was wondering about the reason behind this issue. It is happening both on my local machine and in a codesandbox example. Here is the error from my local: Failed...
Wen Yao
1 vote · 0 answers · 37 views

Unhandled error event occurred while using ttf font file in React JS

I've been using this MYFONT.ttf in my React Project for a long time. But suddenly it's throwing the following error when I try to yarn start the project. Starting the development server... events.js:182 throw er; // Unhandled 'error' event ^ Error: watch /home/kbtganesh/Documents/Pure/fe-home-loan/p...
Ganesh
1 vote · 0 answers · 173 views

How can the starting time of a Spark application on a YARN cluster be shortened?

I'm managing a system that submits a Spark application to a YARN cluster in client mode. It takes about 30 seconds for a submitted Spark application to become ready on the YARN cluster, which seems a bit slow to me. Are there any ways of shortening the starting time of a Spark application on a YAR...
Keiji Yoshida
1 vote · 0 answers · 231 views

Processing half a billion rows with PySpark creates shuffle read problems

I am apparently facing a shuffle read problem. My PySpark script is running on a Hadoop cluster with 1 edge node and 12 datanodes, using YARN as the resource manager and Spark 1.6.2. ###[ini_file containing conf spark] spark.app.name = MY_PYSPARK_APP spark.master = yarn-client spark.yarn.queue = agr_queue s...
Indy McCarthy
1 vote · 0 answers · 332 views

Hadoop / Spark: kill a task and do not retry

I would like to know if there is a way to stop a job (and have it end in a FAILED or KILLED state) when I detect something wrong within a map or reduce task, without Hadoop retrying the task. If possible I would like to keep the behavior that on 'normal' failures YARN restarts the task. Currently I am throwing...
Benjamin
1 vote · 1 answer · 206 views

Default queue under YARN capacity policy

Using the following queue configuration under the YARN capacity policy, how is the default queue chosen when no queue is specified at job launch? yarn.scheduler.capacity.root.queues = prod,dev; yarn.scheduler.capacity.root.dev.queues = eng,science. I know that under the fair policy you can choose a default q...
JaviOverflow
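As a note on the question above: under the Capacity Scheduler, a job that names no queue is submitted to the queue literally called "default" (the client-side default of mapreduce.job.queuename), so with only prod and dev defined such a submission has nowhere to land. A hedged capacity-scheduler.xml sketch that declares one (the 20/40/40 capacity split is illustrative only):

```xml
<!-- capacity-scheduler.xml sketch: declare a "default" queue so jobs
     submitted without an explicit queue name have somewhere to land. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,prod,dev</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>20</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.prod.capacity</name>
  <value>40</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.capacity</name>
  <value>40</value>
</property>
```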
1 vote · 0 answers · 47 views

How to reserve resources for a certain task in Spark?

I have two kinds of tasks in Spark, A and B. In spark.scheduler.pool I have two pools, APool and BPool. I want task A to always be executed in APool while B runs in BPool, with the resources in APool reserved for A, because task B may take too many resources to execute. Every time B is executing...
HalfLegend
1 vote · 0 answers · 142 views

Spark job creates too many tasks

I am developing code in Scala to launch on a Cloudera cluster. My code is: def func_segment(model: String): String = { if (model == "A1" || model == "B1" || model == "C1" || model == "D1") "NAME1" else if (model == "A2" || model == "B2") "NAME2" else "NAME3" } val func_segment_udf = udf((model...
1 vote · 2 answers · 586 views

How to change java.io.tmpdir for spark job running on yarn

How can I change the java.io.tmpdir folder for my Hadoop 3 cluster running on YARN? By default it points at something like /tmp/***, but my /tmp filesystem is too small for everything the YARN job will write there. Is there a way to change it? I have also set hadoop.tmp.dir in core-site.xml, but it looks like...
smikesh
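One approach worth noting for the tmpdir question above: java.io.tmpdir is a plain JVM system property, so on YARN it can be passed to the driver and executors through the extraJavaOptions settings, while the containers' own scratch location is governed by yarn.nodemanager.local-dirs in yarn-site.xml. A sketch, where /data/tmp is a placeholder path:

```
# spark-defaults.conf sketch -- /data/tmp is a placeholder path
spark.driver.extraJavaOptions    -Djava.io.tmpdir=/data/tmp
spark.executor.extraJavaOptions  -Djava.io.tmpdir=/data/tmp
```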
1 vote · 1 answer · 137 views

Hadoop Multinode Cluster : Connection failed with slave node

I'm trying to use my Hadoop multinode cluster: 1 namenode (master) and 2 datanodes (slave1 & slave2). I would like to run some tests with MapReduce, but I'm hitting an issue and I can't find a solution anywhere. I uploaded a file called data.txt to my HDFS and created both files: mapper.py and red...
Essex
1 vote · 0 answers · 454 views

How to configure yarn workspaces

I configured our new project using yarn workspaces as below: MyProject - native - shared - web. shared is a dependency of both native and web. As we're using MobX and decorators, I configured transform-decorators-legacy in the .babelrc of both the shared and web folders. Now I'm trying to run npm start in we...
Anesh
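For context on the workspaces layout described above, the root package.json would look roughly like this (folder names taken from the question; "private": true is required because workspace roots are not publishable):

```json
{
  "private": true,
  "workspaces": ["native", "shared", "web"]
}
```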
1 vote · 0 answers · 149 views

Hadoop: No logs available for container “only in one cluster”

I have a 3-node Hadoop cluster: a master and two slaves, slave1 and slave2. After running a YARN job I can access logs only from the second slave. When I try to access the logs of slave1 I get 'No logs available for container container_1521590274787_0002_01_000001', even though the code was entirely executed by sl...
1 vote · 0 answers · 482 views

Spark: Connection refused webapp proxy on yarn

I am using Spark and Hadoop in Docker containers: I have 3 containers, a master and 2 slaves. Everything is working properly, but I have a problem with the Spark proxy webapp when running a task. I can connect to the YARN webapp through http://172.20.0.2:8088/ and I can also access the nodes with http://172.20.0.3:8042/...
1 vote · 0 answers · 997 views

Could not find CoarseGrainedScheduler or it has been stopped

Recently I started working on Talend Big Data. I have written some jobs that are getting executed properly. The data to process in every run is less than a GB, but it's a .gz file. The job runs successfully, but I am frequently facing the error below, although when reprocessing the job without any changes then...
Varun
1 vote · 0 answers · 41 views

Is it normal for Hadoop to be 60 times slower than local computation?

I am doing machine learning tasks (forward execution) on a Hadoop cluster with dozens of nodes. With dataset sizes like 10k samples of 300 features, Hadoop processing (running common models like shallow neural nets or decision trees) can take up to 6 hours. At the same time, the same or si...
Dims
1 vote · 0 answers · 47 views

How to get the scheduler of an already finished job in Yarn Hadoop?

So I'm in a situation where I'm modifying mapred-site.xml and the specific configuration files of different schedulers for Hadoop, and I just want to make sure that the modifications I have made to the default scheduler (FIFO) have actually taken place. How can I check the scheduler applied to a...
mani
1 vote · 0 answers · 293 views

Spark jobs are not running in the specified queue

I am running a Spark job written in Scala. val conf = new SparkConf().setAppName("BigDataSparkInitialPoc"); conf.set("spark.yarn.queue", "Hive"); val sc = new SparkContext(conf). The above code is not submitting my job to the queue called "Hive"; instead the job is running in the default queue. I check...
TomG
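Regarding the queue question above, the target queue can also be fixed at submit time, which avoids any doubt about when the SparkConf value is read; note too that queue names must match the scheduler configuration exactly, so "Hive" with a capital H is only correct if the queue was defined that way. A sketch, where the jar and class names are placeholders:

```
# spark-submit sketch -- jar name and class are placeholders
spark-submit --master yarn --queue Hive \
  --class com.example.Poc my-app.jar
```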
1 vote · 1 answer · 149 views

Possible to start Dask in yarn-client mode?

I use dask_yarn (part of knit) to start a Dask Yarn cluster as follows: import dask_yarn cluster = dask_yarn.DaskYARNCluster(env='/home/hadoop/reqs/dvss.zip', lang='en_US.UTF-8') cluster.start(n_workers=4, memory=5120, cpus=3) This requests 1 vCore on core nodes for AM, and gives the rest of the vCo...
j-bennet
1 vote · 0 answers · 187 views

Spark submit a parallel job

I have a problem with Apache Spark. I have a cluster with 10 nodes (1 master and 9 slaves), and each node has 1048MB of memory. I work in machine learning, so I'd like to run my implementation in parallel, but I cannot make it work: there is always a single worker that executes the application I submit...
Yacine Mohammed
1 vote · 1 answer · 134 views

Container is running beyond virtual memory limits. Killing container

Current setup: MySQL connector version mysql-connector-java-5.1.13, Sqoop version sqoop-1.4.6, Hadoop version hadoop-2.7.3, Java version jdk-8u171-linux-x64/jdk1.8.0_171 (Oracle JDK), OS Ubuntu. Note: also tried with OpenJDK; the same issue exists with that version as well. Sqoop command: bin/sqoop import -conne...
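The "beyond virtual memory limits" error above is commonly addressed either by giving the launched map tasks more memory or by relaxing YARN's virtual-memory check, which is known to be overly strict with some JVMs. A hedged yarn-site.xml sketch of the latter (the ratio value is illustrative; the stock default is 2.1):

```xml
<!-- yarn-site.xml sketch: either disable the vmem check entirely
     or raise the allowed virtual-to-physical memory ratio. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
```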
1 vote · 1 answer · 203 views

Connecting to Kerberos- and SSL-enabled Solr from a Spark job under YARN

I have a SOLR 6 cluster which is Kerberos- and SSL-enabled. When I connect to it with a test client using CloudSolrClient it works fine, but when the same code runs in a Spark job driver I get the 'checksum failed' error below. I checked all the mentioned issues related to checksums, like reverse DNS lookup, and...
avinash patil
1 vote · 1 answer · 80 views

Monitoring and checking status of YARN

How can I access YARN metrics such as the status of the ResourceManager and the NodeManagers? The same question applies to running YARN containers. I would like to do it via a web interface.
CypherFancy
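For the monitoring question above: besides the ResourceManager web UI (port 8088 by default), the RM exposes a JSON REST API (/ws/v1/cluster/metrics for cluster health, /ws/v1/cluster/nodes for NodeManagers). A small Python sketch, assuming a reachable RM on the default port; the field names follow the Hadoop RM REST documentation:

```python
# Sketch: reading YARN cluster health from the ResourceManager REST API.
# The RM host and port 8088 are assumptions; adjust them for your cluster.
import json
from urllib.request import urlopen

def fetch_cluster_metrics(rm_host: str, port: int = 8088) -> dict:
    """Return the clusterMetrics object from the RM's REST API."""
    with urlopen(f"http://{rm_host}:{port}/ws/v1/cluster/metrics") as resp:
        return json.load(resp)["clusterMetrics"]

def summarize(metrics: dict) -> str:
    """Condense the fields we usually glance at into one line."""
    return (f"nodes={metrics['activeNodes']} "
            f"apps_running={metrics['appsRunning']} "
            f"containers={metrics['containersAllocated']}")

# Example with a canned response whose shape matches the RM API:
sample = {"activeNodes": 3, "appsRunning": 2, "containersAllocated": 7}
print(summarize(sample))
```

The same numbers are also available from the CLI (`yarn node -list`, `yarn application -list`) if a web interface is not required.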
1 vote · 1 answer · 108 views

Jenkins grunt compass ENOENT No such file or directory @ realpath_rec

I am working on an existing project, replacing bower with yarn and upgrading AngularJS from 1.2.9 to 1.3.0. I've got it working on my local system, but it fails on Jenkins when running the deploy grunt task with a file path issue; the weird thing is that on Jenkins it complains with my local path. Errno::ENOENT...
Subash
