Can we improve performance on lists with something other than Java 8 parallel streams?

4 weeks ago

0

I have to dump data from a source system by calling a REST API that returns a List.

  1. First I get a List of objects from one REST API. I then use a parallel stream and go through each item with forEach.

  2. For each element I call another API to get its data, which again returns a list, and I save that list by calling a third REST API.

  3. This takes around 1 hour for the 6000 records from step 1.

This is what I tried:

restApiMethodWhichReturns6000Records()
    .parallelStream()
    .forEach(id -> anotherMethodWhichgetsSomeDataAndPostsToOtherRestCall(id));


public void anotherMethodWhichgetsSomeDataAndPostsToOtherRestCall(String id) {
    // fetch the list of data for this id, then post it to the other REST API
    sestApiToPostData(url, methodThatGetsListOfData(id));
}

2 answers

0

If the API calls are blocking, a parallel stream will only run a few of them at a time: each worker thread sits idle while it waits for a response, and there are only about as many worker threads as CPU cores.

I would try out a solution using CompletableFuture.

The code would be something like this:

// needs java.util.List, java.util.concurrent.CompletableFuture, java.util.stream.Collectors
List<CompletableFuture<Void>> apiCallsFutures = restApiMethodWhichReturns6000Records()
    .stream()
    .map(id -> CompletableFuture.supplyAsync(() -> getListOfData(id))    // start the "get list of data" call asynchronously
                                 .thenAccept(listOfData -> callAPItoPOSTData(url, listOfData)))   // once the list has arrived, perform the POST call
    .collect(Collectors.toList());

CompletableFuture[] completableFutures = apiCallsFutures.toArray(new CompletableFuture[0]); // CompletableFuture.allOf accepts only arrays :(

CompletableFuture<Void> all = CompletableFuture.allOf(completableFutures); // completes when every future has completed

all.join(); // the calls are already running; this only blocks until they have all finished

For more details about CompletableFuture, have a look at: https://www.baeldung.com/java-completablefuture
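One caveat: supplyAsync without an explicit executor runs on the common ForkJoinPool, the same pool parallel streams use, so blocking HTTP calls run into the same thread limit. Below is a minimal sketch that passes a dedicated fixed-size pool instead. The class name, `dumpAll`, `fetchData`, and `postData` are hypothetical stand-ins for the two REST calls, and the pool size is an assumption you would tune against what the remote API tolerates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collectors;

final class DedicatedPoolDump {

    // Runs fetch-then-post for every id on a dedicated pool and waits for all of them.
    static <T> void dumpAll(List<String> ids,
                            Function<String, List<T>> fetchData,  // hypothetical: GET call for one id
                            Consumer<List<T>> postData,           // hypothetical: POST call for one list
                            int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<Void>> futures = ids.stream()
                .map(id -> CompletableFuture
                    .supplyAsync(() -> fetchData.apply(id), pool)  // GET runs on the dedicated pool
                    .thenAcceptAsync(postData, pool))              // POST runs there too, after the GET
                .collect(Collectors.toList());

            // Block until every fetch-then-post chain has finished.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Fake GET and POST so the sketch runs without a network.
        DedicatedPoolDump.<String>dumpAll(
            Arrays.asList("1", "2", "3"),
            id -> Arrays.asList(id + "-a", id + "-b"),
            data -> System.out.println("posting " + data),
            2);
    }
}
```

If any call fails, join() throws a CompletionException wrapping the first failure it sees, so a real job would probably want per-id error handling (for example with exceptionally) rather than aborting the whole dump.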

2

parallelStream can behave unexpectedly. It runs on the common ForkJoinPool, which is shared across the whole JVM, so long-running tasks submitted by parallel streams elsewhere in the code can hold up yours. Even within a single stream, if the tasks are slow (for example, blocking on a network call), they occupy all the worker threads and the remaining elements have to wait.

There is a good discussion of this on Stack Overflow, including some tricks for running a parallel stream in a task-specific ForkJoinPool.
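The task-specific pool trick usually looks something like the sketch below: the parallel stream is started from inside a task submitted to a separate ForkJoinPool, so its work runs on that pool's threads instead of the common pool. This relies on implementation behavior rather than a documented guarantee of the Stream API, so treat it as an assumption to verify on your JVM. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Consumer;

final class CustomPoolStream {

    // Runs items.parallelStream().forEach(action) from inside a dedicated ForkJoinPool.
    static <T> void forEachInPool(List<T> items, Consumer<T> action, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // submit(Runnable) returns a ForkJoinTask; join() waits for the whole stream to finish.
            pool.submit(() -> items.parallelStream().forEach(action)).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        CustomPoolStream.<String>forEachInPool(
            Arrays.asList("a", "b", "c", "d"),
            id -> System.out.println(Thread.currentThread().getName() + " handles " + id),
            8);
    }
}
```

For blocking REST calls the parallelism can be set well above the core count, since the threads spend most of their time waiting.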

First of all, make sure your REST service is non-blocking.

Another thing you can try is tuning the pool size by passing -Djava.util.concurrent.ForkJoinPool.common.parallelism=4 to the JVM.
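To check what the setting actually resolves to, you can print the common pool's parallelism at startup. This is a small sketch with a hypothetical class name. By default the value is one less than the number of available processors, and the property is easiest to apply on the command line so that it is in place before the common pool is first used.

```java
import java.util.concurrent.ForkJoinPool;

final class CommonPoolInfo {

    // Parallelism of the shared pool used by parallel streams and by
    // CompletableFuture.supplyAsync when no executor is passed.
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors:    " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: " + commonParallelism());
    }
}
```

Running it once without the flag and once with the flag shows whether the JVM picked up the value.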