# Numpy ordered subtraction of elements with full vectorization

 Refresh February 2019 Views 31 time
1

Imagine a `mxn` array `a`, and a `1xn` array `b`, we want to subtract `b` from `a` so that `b` is subtracted from the first element of `a`, then maximum of zero and `b-a[0]` is subtracted from `a[1]`, and so on...

So:

``````x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
a = np.repeat(x, 100000).reshape(10, 100000)
b = np.repeat(np.array([5]), 100000).reshape(1, 100000)
``````

So we want to get: `[ 0, 0, 1, 4, 5, 6, 7, 8, 9, 10]`, repeated 100,000 times.

I've managed the function below which provides the desired outcome:

``````def func(a, b):
n = np.copy(a)
m = np.copy(b)
for i in range(len(n)):
n[i] = np.where(n[i] >= m, n[i] - m, 0)
m = np.maximum(0, m - a[i])
if not m.any():
return n
return n
``````

However, it's not fully vectorized. So:

``````>> timeit func(a, b)
3.23 ms ± 52.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
``````

Ideally, I'd like to get rid of the for-loop and make this as vectorized as possible.

Thanks.

0

I think you can vectorize your function like this:

``````import numpy as np

def func_vec(a, b):
ar = np.roll(a, 1, axis=0)
ar[0] = 0
ac = np.cumsum(ar, axis=0)
bc = np.maximum(b - ac, 0)
return np.maximum(a - bc, 0)
``````

Quick test:

``````import numpy as np

def func(a, b):
n = np.copy(a)
m = np.copy(b)
for i in range(len(n)):
n[i] = np.where(n[i] >= m, n[i] - m, 0)
m = np.maximum(0, m - a[i])
if not m.any():
return n
return n

np.random.seed(100)
n = 100000
m = 10
num_max = 100
a = np.random.randint(num_max, size=(m, n))
b = np.random.randint(num_max, size=(1, n))
print(np.all(func(a, b) == func_vec(a, b)))
# True
``````

However, your algorithm has an important advantage over the vectorized algorithm, which is that it stops the iteration when it finds that there is nothing else to subtract. This means that, depending on the size of the problem and the specific values (which is what determines when does the early stop happens, if at all), the vectorized solution may actually be slower. See for the above example:

``````%timeit func(a, b)
# 5.09 ms ± 78.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit func_vec(a, b)
# 12.4 ms ± 939 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
``````

You can, however, get something of a "best of both worlds" solution using Numba:

``````import numpy as np
import numba as nb

@nb.njit
def func_nb(a, b):
n = np.copy(a)
m = np.copy(b)
zero = np.array(0, dtype=a.dtype)
for i in range(len(n)):
n[i] = np.maximum(n[i] - m, zero)
m = np.maximum(zero, m - a[i])
if not m.any():
return n
return n

print(np.all(func(a, b) == func_nb(a, b)))
# True
%timeit func_nb(a, b)
# 3.36 ms ± 461 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
``````