Questions tagged [parallel-processing]

0

votes
2

answer
50

Views

Can we improve performance on lists other than java 8 parallel streams

I have to dump data from somewhere by calling a REST API that returns a List. First I have to get a List object from one REST API. I then use a parallel stream and go through each item with forEach. For each element I have to call another API to get data, which again returns a list, and save...
Pavan
1

votes
2

answer
35

Views

How to handle API error in a foreach loop R?

FYI, based on some comments I added more information. I created the following function that is making a call to an API: keyword_checker
Franck
1

votes
1

answer
30

Views

Multiprocessing on a list being passed to a function

I have a function that processes one url at a time: def sanity(url): try: if 'media' in url[:10]: url = 'http://dummy.s3.amazonaws.com' + url req = urllib.request.Request(url, headers={'User-Agent' : 'Magic Browser'}) ret = urllib.request.urlopen(req) allurls.append(url) return 1 except (urllib.requ...
Eswar
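
A minimal sketch of how a per-URL function like the one above can be mapped over a list with a pool, assuming the goal is to collect the URLs that pass the check. It uses `multiprocessing.pool.ThreadPool` (threads rather than processes, a swap made here because the work is I/O-bound and the example stays self-contained); `multiprocessing.Pool` offers the same `map` call. The names `normalize`, `check_all` and `fake_fetch` are hypothetical, and the network request from the question is replaced by an offline stand-in.

```python
from multiprocessing.pool import ThreadPool


def normalize(url):
    """Prefix relative media paths with the host used in the question."""
    if 'media' in url[:10]:
        return 'http://dummy.s3.amazonaws.com' + url
    return url


def check_all(urls, fetch, workers=4):
    """Run fetch(url) for every URL in a pool and return the URLs that succeeded.

    fetch is a hypothetical callable that raises on failure; in the question
    it would wrap urllib.request.urlopen.
    """
    def sanity(url):
        full = normalize(url)
        try:
            fetch(full)
            return full          # return the result instead of appending to a global
        except Exception:
            return None

    with ThreadPool(workers) as pool:
        results = pool.map(sanity, urls)   # results keep the input order
    return [u for u in results if u is not None]


def fake_fetch(url):
    """Offline stand-in for a real request: rejects URLs containing 'bad'."""
    if 'bad' in url:
        raise OSError('unreachable')


good = check_all(['/media/a.png', 'http://bad.example/x', 'http://ok.example/y'], fake_fetch)
```

Returning each result from the worker matters more with a process-based pool: a global list appended to inside worker processes is not shared with the parent, and the worker would also need to be a module-level function so it can be pickled.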
1

votes
1

answer
39

Views

Does @everywhere not load a function on the master?

I made a module with an if condition on the number of cores. If the number of cores is more than 1 the route is parallel; otherwise, it goes the serial route as seen in the code below module mymodule import Pkg using Distributed if nworkers() > 1 @everywhere using Pkg @everywhere Pkg.activate('.') @...
0

votes
0

answer
6

Views

c# start same console application multiple times with different parameters at scheduled intervals

I have a console application that does a few jobs; let's call them tasks. I want to be able to run each task individually or in parallel, and one execution should not affect the other. For example, if I have a console application CONSOLE A that accepts as a parameter a string, a URL, and when it is run, it gets...
user2818430
6

votes
0

answer
104

Views

How can I optimize parallel sorting to improve temporal performance?

I have an algorithm for parallel sorting a list of a given length: import Control.Parallel (par, pseq) import Data.Time.Clock (diffUTCTime, getCurrentTime) import System.Environment (getArgs) import System.Random (StdGen, getStdGen, randoms) parSort :: (Ord a) => [a] -> [a] parSort (x:xs) = force...
Vasiliy
1

votes
1

answer
300

Views

NumericMatrix not recognized as a type in RcppParallel Package

I'm learning to use RcppParallel to use in my work and was trying to install a simple package made with Rcpp.package.skeleton(). The package contains three source files, the Rcpp's HelloWorld (rcpp_hello_world.cpp) and the two versions of the matrix transformation functions found in the RcppParallel...
Alpalentless
1

votes
1

answer
36

Views

Assigning values to global environment variable using parallel sapply

I have recently started working with the parallel package in R and it is working wonders for me. Still, I have encountered an issue for which I have not found an answer. I am trying to reformat some data and, to do so, I use sapply() or parSapply() in the parallel case. In the normal case, I go: sapply...
boski
1

votes
1

answer
48

Views

OpenMP program works without critical section

In an OpenMP lecture, similar code is shown as an example of a race condition. In the for loop the sum+= is not in a critical section, so the order in which the threads execute changes the result. But this is not the case in my program. No matter how often I run this program, the sum is always printed as 28...
relot
1

votes
1

answer
62

Views

Please help understanding Haskell Parallel

I have read some docs that use Fibonacci as an example. Then I started trying to parallelise my code, which mostly works with lists. My code did not get any faster. Sample code: parMap :: (a -> b) -> [a] -> [b] parMap f = withStrategy (parList rseq) . map f parZipWith :: (a -> b -> c) -> [a] -> [b] -> [c] parZ...
Magicloud
0

votes
0

answer
12

Views

“'OSError: [Errno 23] Too many open files in system:' When importing functions on all engines”

I am trying to process a list of objects in parallel on a cluster using ipyparallel, but I am getting an error saying too many files are open. I am using Jupyter notebook and can start 230 engines on the cluster. When trying to impor...
BND
1

votes
0

answer
76

Views

R parallel processing with for loop in function

I am building a parallel processing loop which has a for loop nested inside. For simplicity's sake I am only including one line of code within the function and for loop, but in reality there are about 1000 lines of code. In this small example, I am building a grid for which I would like to iterate ov...
CooperBuck
1

votes
1

answer
342

Views

Auto Parallelization with VS

I am trying to understand how the auto-parallelization works to speed up the execution of a program I am writing. I have created a simpler example: #include #include #include using namespace std; using namespace std::chrono; class matrix { public: matrix(int size, double value) { A.resize(size,...
Mattia
0

votes
0

answer
2

Views

Maximum number of CUDA blocks?

I want to implement an algorithm in CUDA that takes an input of size N and uses N^2 threads to execute it (this is the way the particular algorithm works). I've been asked to make a program that can handle up to N = 2^10. I think for my system a given thread block can have up to 512 threads, but for...
gkeenley
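
The arithmetic behind this question can be worked through in a few lines. This is only a sketch: the figure of 512 threads per block is the asker's own assumption about their system, and `blocks_needed` is a hypothetical helper name.

```python
def blocks_needed(total_threads, threads_per_block):
    """Ceiling division: smallest number of blocks that covers every thread."""
    return (total_threads + threads_per_block - 1) // threads_per_block


n = 2 ** 10                 # largest input size named in the question
total_threads = n * n       # one thread per (i, j) pair, i.e. N^2
blocks = blocks_needed(total_threads, 512)   # 512 is the per-block limit the asker assumes
print(total_threads, blocks)                 # prints: 1048576 2048
```

Whether that many blocks fit in one launch depends on the device's grid-size limits, which can be read from the `maxGridSize` field returned by `cudaGetDeviceProperties`.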
1

votes
1

answer
73

Views

Create neighborhood list of large dataset / speed up

I want to create a weight matrix based on distance. My code currently looks as follows and works for a smaller sample of the data. However, with the large dataset (569424 individuals in 24077 locations) it doesn't go through. The problem arises at the nb2blocknb function. So my question would...
Kerstin
1

votes
1

answer
24

Views

Check whether processpool completed processing python3

How can I check whether the processes scheduled in a process pool have completed? I only want to execute the rest of the code after the process pool has finished. Is there a way to do this? executornan = concurrent.futures.ProcessPoolExecutor(20) for l, ch in enumerate(chunks): print('CHUNK NUMBEr', l) print...
chris1234
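
One way to block until every submitted task has finished is `concurrent.futures.wait`, sketched below. The names `process_chunk` and `run_all` are hypothetical stand-ins, and the demo uses a `ThreadPoolExecutor` so it runs anywhere; the helper takes any executor, so a `ProcessPoolExecutor` can be passed in the same way.

```python
import concurrent.futures


def process_chunk(chunk):
    """Hypothetical stand-in for the real per-chunk work."""
    return sum(chunk)


def run_all(executor, chunks):
    """Submit every chunk, then block until all submitted futures are done."""
    futures = [executor.submit(process_chunk, ch) for ch in chunks]
    done, not_done = concurrent.futures.wait(futures)   # default: return when all have completed
    assert not not_done
    return [f.result() for f in futures]                # result() re-raises worker exceptions


with concurrent.futures.ThreadPoolExecutor(4) as pool:
    totals = run_all(pool, [[1, 2], [3, 4], [5]])
# Code placed here runs only after all chunks are processed.
```

With a process pool, create the executor under `if __name__ == "__main__":`. Leaving the `with` block also waits for pending work, because it calls `shutdown(wait=True)`.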
1

votes
0

answer
218

Views

What's the difference between Sequential and Synchronous Execution?

If I understand it correctly: Asynchronous Execution - One task doesn't have to wait for another to finish Concurrent Execution - Two tasks are being worked during a common time period (usually through context switching) But the opposites of both of those seem the same. Synchronous Execution - One...
master_of_privates
1

votes
0

answer
65

Views

Optimization: alternatives to passing large array to map in ipyparallel?

I originally wrote a nested for loop over a test 3D array in Python. As I wanted to apply it to a larger array, which would take a lot more time, I decided to parallelise using ipyparallel by writing it as a function and using bview.map. This way I could take advantage of multiple cores/nodes on a supe...
mallowcodes
1

votes
0

answer
139

Views

Python Data Loading slow performance

I'm trying to read an Excel file and bulk upload the data to a SQL Server table. Data loading works perfectly, but it is taking longer than expected: 28,000 records take 80 seconds. I need to load 2-3 GB files on several occasions. I'm quite new to Python. Can you please take a look at the script and let me kn...
Partha
1

votes
0

answer
210

Views

Pragma omp parallel overhead

I have a problem with a #pragma omp parallel section in my code. I have a program which should sort a given array of integers with quicksort using multiple threads. For this, in every step every thread gets assigned a portion of the array, partitions it and returns how many elements are smaller than a...
Anton
1

votes
0

answer
64

Views

Parallelism with Gunicorn, Nginx and Celery

I am not sure how parallelism can or should be combined between gunicorn and celery (and probably nginx). 1) First of all, I use Nginx. 2) Secondly, I run gunicorn like this: gunicorn -k gevent --worker-connections 1001 --bind=unix:myapp.sock -m 007 wsgi:application 3) Thirdly, I run celery like t...
Laimonas Sutkus
1

votes
0

answer
125

Views

joblib - the parallel code takes more time than the non-parallel code

I am using joblib for the first time. I am using Jupyter notebook on Windows, on a 16-core machine. It seems that my code runs much slower when using joblib parallel processing compared with a single process. I can see that joblib created processes to do the job. But all of them used less than 2% CPU exc...
Karen Chen
1

votes
0

answer
46

Views

Linking between dask.distributed and LSF cluster

I'm using IBM's LSF platform to run my code in parallel. At the moment, this entails 'manually' breaking the code into a job array; instead of: for i in range(100): x[i] = f(i) I distribute f over 100 workers, and then 'manually' collect all their 100 different results to x. I'm trying to understan...
Adam Haber
1

votes
1

answer
208

Views

doParallel:::doParallelSNOW complains when foreach(…, .export) is specified

I'm curious what the design argument for doParallel:::doParallelSNOW() giving a warning when globals are explicitly specified in .export could be? For example, library('doParallel') registerDoParallel(cl
HenrikB
1

votes
0

answer
98

Views

Apache Spark: Running jobs in parallel in standalone mode

We are trying to get data from an Oracle database into a Kinetica database through Apache Spark. We installed Spark in standalone mode and executed the following commands. However, although we have tried everything, we couldn't manage to run jobs in parallel. We use 2 IBM servers, each of which has 128 cores a...
O.Ekinci
1

votes
0

answer
54

Views
1

votes
1

answer
39

Views

How to have two nested Parfors iterating over two huge arrays in Matlab?

Considering two arrays, A=Huge_array_one and B=Huge_array_two, how can I change the following code to work in Matlab (as Matlab doesn't accept nested parfor loops)? parfor (i,j) in all_combinations_of_A_and_B_indices A_in_this_worker = A(i); B_in_this_worker = B(j); .... end Please note that I don't want...
CoderInNetwork
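
A common way to express a loop over all (i, j) combinations as a single loop is to iterate over one flat index and recover the pair from it. The sketch below shows the index arithmetic in Python with 0-based indices; `pair_from_flat` is a hypothetical helper, and the small lists stand in for the huge arrays.

```python
def pair_from_flat(k, len_b):
    """Map a flat index k to the (i, j) pair it stands for (0-based)."""
    return divmod(k, len_b)


A = [10, 20, 30]
B = ['x', 'y']
pairs = []
for k in range(len(A) * len(B)):      # this single loop is the one that would be parallelised
    i, j = pair_from_flat(k, len(B))
    pairs.append((A[i], B[j]))
```

In Matlab the indices are 1-based, so the arithmetic needs adjusting; the built-in `ind2sub` performs the equivalent conversion there.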
1

votes
0

answer
117

Views

Updating batch image array in-place when using joblib

This is a follow-up question for my solution to the question below: How to apply a function in parallel to multiple images in a numpy array? My suggested solution works fine if the function process_image() has to return the result and then we can cache that to some list for later processing. Since I...
kmario23
1

votes
2

answer
180

Views

Distributed Locking

I have 3 processes (P1, P2, and P3), each running on a different machine. These processes share 3 tables (T1, T2 and T3) in a database. While updating these tables I need to maintain atomicity across all 3 tables at once (either all tables should be modified or none of them). My databa...
Kishor Bhandari
1

votes
0

answer
219

Views

C++ OpenMP: How to use private/protected member variables in a parallel region inside a function member?

TL;DR: I have a class member function in which there is some parallel code that uses other private or protected class members. My class structure is similar to this: class ChildClass : public GeneralClass { private: std::vector edgePotentials; protected: // Graph structure size_t numberOfNodes; s...
Khue
1

votes
1

answer
339

Views

NullPointerException when switch from stream to parallelStream

Help me understand this; I have a stream based logic that groups entities into a map-of-lists based on some key string that is constructed out of its fields. Using stream this runs without any error: Map mapOfkeyToListOfEntities = baseJournalEntries .stream() .collect(Collectors.groupingBy(eneity ->...
tbeernot
1

votes
0

answer
251

Views

Intel TBB: read a file, apply a function for each line and save the result to a vector

With intel TBB I'm trying to read a file `serial', apply a function on each line of the file, and save the result to a vector of a type A. struct A { long long time; double price; A(long long t, double p) : time(t),price(p){}; } I have created the following pipeline vector parallelFile(string fileNa...
david.t_92
1

votes
0

answer
45

Views

parallelize array with tbb::parallel_for

I'm new to Intel TBB and I am trying to parallelize an array of vectors. This is the function, to which I pass the array as a pointer. void calculate(map &m, vector* ord, auto&k, string &dir){ for(int i=0;i
1

votes
1

answer
107

Views

Is it possible for parallel processing in Oauth authentication

I am trying to use multiple threads to connect to CData drivers. Is parallel processing of data possible in CData? OdbcConnection conn = new OdbcConnection(); conn.ConnectionString = 'xxxx'; Task task1 = Task.Factory.StartNew(() => ReadData(conn)); Task task2 = Task.Factory.StartNew(()...
mRhNs13
1

votes
0

answer
25

Views

Efficient adf.test calculation in parallel environment

I'm working on a project where specific econometric tests are used, and the increasing timeframe window is applied. I've written a piece of code for linear calculations, and it takes about 20 seconds to get the results on my core i5 PC with 16M RAM installed. To speed up the calculations I’ve tri...
Anton
1

votes
1

answer
190

Views

Reading data from a large file

I have a text file which is as follows. 0.031 0.031 0.031 1.4998 0.9976 0.5668 0.9659 0.062 0.031 0.031 0.9620 0.7479 0.3674 0.4806 and so on...... This is a 32^3 grid which means there will be 32768 lines. In each line, there are 7 columns. I need to read each column and store it in separate 1D ar...
Ani
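
A minimal sketch of reading such a file into one array per column. It assumes each line holds exactly 7 whitespace-separated numbers, so the sample values in the excerpt are treated as two rows of 7; `read_columns` is a hypothetical helper name, and plain Python lists stand in for the 1D arrays.

```python
def read_columns(lines, n_cols=7):
    """Parse whitespace-separated rows and return one list per column."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        if len(fields) != n_cols:
            raise ValueError(f'expected {n_cols} values, got {len(fields)}')
        rows.append([float(x) for x in fields])
    return [list(col) for col in zip(*rows)]   # transpose rows into columns


sample = [
    '0.031 0.031 0.031 1.4998 0.9976 0.5668 0.9659',
    '0.062 0.031 0.031 0.9620 0.7479 0.3674 0.4806',
]
columns = read_columns(sample)
```

For a real file the same function accepts an open file object, since iterating over it yields lines: `with open(path) as f: columns = read_columns(f)`.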
1

votes
2

answer
202

Views

Julia function timeout using async and remotecall_fetch fails to find function

I'm trying to kill execution of a function when it times out. Tried to leverage the post here: Julia: Can you set a time limit on eval. It errored because RemoteRef is undefined (I'm using v0.6.0). Replaced RemoteRef with Channel(1). Now the error is MethodError: no method matching remotecall_fetch (::I...
Tims
1

votes
0

answer
185

Views

Tensorflow Benchmarks Input Pipeline Parallelize Image Processing?

I'm running the TF benchmarks and have read the document High-Performance Models, I have a question: In the document, it said Parallelize I/O Reads data_flow_ops.RecordInput is used to parallelize reading from disk. Given a list of input files representing TFRecords, RecordInput continuously reads r...
Cherie Huang
1

votes
0

answer
88

Views

Julia using Slurm invokes only one node

srun --nodes=3 hostname returns successfully all the 3 node names but srun --nodes=3 julia test.jl fails with error below where test.jl is given at the end here Worker 2 terminated. ERROR (unhandled task failure): Version read failed. Connection closed by peer. Stacktrace: [1] process_hdr(::TCPSocke...
Tims
1

votes
0

answer
338

Views

Python: What is the biggest difference between `Celery` lib and `Multiprocessing` lib in respect of parallel programming?

I think that all tasks that can be done using Celery can also be done via the multiprocessing library. Despite this, I wonder why one would use Celery instead of multiprocessing in a Python program or a web framework such as Django, Flask, etc.
user3595632
