# Speed up minimum search in Numpy/Python - Python

TAGS :
Viewed: 7 - Published at: a few seconds ago

#### [ Speed up minimum search in Numpy/Python ]

I have two floating arrays and want to find data points which match within a certain range. This is what I got so far:

``````import numpy as np

for vx in range(len(arr1)):
match = (np.abs(arr2-arr1[vx])).argmin()
if abs(arr1[vx]-arr2[match])&lt;0.375:
point = arr2[match]
``````

The problem is that arr1 contains 150000 elements and arr2 around 110000 elements. This takes an awful amount of time. Do you have suggestions to speed things up?

# Answer 1

In addition to not being vectorized, your current search is (n * m) where n is the size of arr2 and m is the size of arr1. In these kinds of searches it helps to sort arr1 or arr2 so you can use a binary search. Sorting ends up being the slowest step but it's still faster if m is large because the n*log(n) sort is faster than (n*m).

Here is how you can do the search in a vectorized way using the sorted array:

``````def find_closest(A, target):
#A must be sorted
idx = A.searchsorted(target)
idx = np.clip(idx, 1, len(A)-1)
left = A[idx-1]
right = A[idx]
idx -= target - left < right - target
return A[idx]

arr2.sort()
closest = find_closest(arr2, arr1)
closest = np.where(abs(closest - arr1) < .375, closest, np.nan)
``````

# Answer 2

The whole idea of using numpy is to avoid computation with loops.

Specifying criteria to extract a new array that satisfies the criteria can be implemented easily with array computation. Here's an example extracting values from array `a` which satisfies the criteria that that element has an absolute different of less than 0.75 from the corresponding element in array `b`:-

``````a = array([1, 0, 0.5, 1.2])

b = array([1.2, 1.1, 1.3, 1.4])

c = a[abs(a-b)<0.75]
``````

Which gives us

``````array([ 1. ,  1.2])
``````