# Questions tagged [nearest-neighbor]

410 questions

2

votes

0

answer

19

Views

### What's the optimal query in PostgreSQL to find the nearest neighbor in a huge dataset?

I have a huge table (about 40 million rows) called nearest_spot. Each row represents a line (in linestring format) together with the spot it is closest to (there are about 1,500 different spots, stored in another table). The nearest_spot table looks like this:
data_id || spot_id || spot_name || link_geom
Where d...

1

votes

1

answer

371

Views

### OpenCV FLANN radiusSearch gives bad results

I'd like to do a radius search to find all valid neighbors, but it seems to give me wrong results. Here is my code:
#include 'opencv/cv.hpp'
#include
#include
int main () {
// create a group of points
std::vector points;
points.emplace_back(438.6, 268.8);
points.emplace_back(439.1, 268.6);
points.emp...

1

votes

1

answer

284

Views

### Providing user defined sample weights for knn classifier in scikit-learn

I am using the scikit-learn KNeighborsClassifier for classification on a dataset with 4 output classes. The following is the code that I am using:
knn = neighbors.KNeighborsClassifier(n_neighbors=7, weights='distance', algorithm='auto', leaf_size=30, p=1, metric='minkowski')
The model works correctl...

1

votes

0

answer

29

Views

### SOM k-means optimization: ValueError: all the input arrays must have same number of dimensions

I am trying to merge k-means into a SOM to find the best matching unit. While clustering the points to return the cluster number for each point, I encounter this error:
'ValueError: all the input arrays must have same number of dimensions'
in line 159
distances_from_center = np.concatenate((distances_fr...

1

votes

0

answer

73

Views

### Nearest Neighbors algorithm by attribute importance

I want to know what solutions Python libraries or TensorFlow provide for finding nearest neighbors weighted by attribute importance.
In the KDTree implementation, all attributes are treated with equal importance.
# KDTree
import numpy as np
from sklearn.neighbors import KDTree
# Neighbors and...

1

votes

1

answer

474

Views

### Implementing KNN with different distance metrics using R

I am working on a dataset in order to compare the effect of different distance metrics. I am using the KNN algorithm.
The KNN algorithm in R uses the Euclidean distance by default, so I wrote my own. I would like to find the number of correct class label matches between the nearest neighbor and...

1

votes

0

answer

109

Views

### Nearest Neighbor matching with replacement Python

I have 2 dataframes, df_test and df_control. For each element in df_test, I am looking for the closest match in df_control based on a feature_list.
I have seen the NearestNeighbors function in scikit-learn (also this answer). However, this function does not have an option for sampling without replac...

1

votes

1

answer

59

Views

### Custom metric for NearestNeighbors sklearn

Hello, I'm working on a project where I use 512-bit hashes to create clusters. I'm using a custom metric, a bitwise Hamming distance. But when I compare two hashes with this function, I get different distances than when using NearestNeighbors.
Extending this to DBSCAN, using eps=5, the clusters are creat...

1

votes

1

answer

271

Views

### KNeighborsClassifier change of distance

I would like to change the distance used by KNeighborsClassifier from sklearn. By distance I mean the one in feature space that determines which points are the neighbors of a given point. More specifically, I want to use the following distance:
d(X1,X2) = 0.1 * |X1[0] - X2[0]| + 0.9*|X1[1] - X2[1]|
Thank you.
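
The distance above reads as a weighted Manhattan (L1) distance. A minimal sketch in plain Python, assuming a brute-force search over a small list of points; `weighted_l1` and `k_nearest` are illustrative names, not part of scikit-learn. KNeighborsClassifier does accept a callable for its `metric` parameter, so a function of this shape could in principle be passed there, although a Python callable is slow on large data.

```python
def weighted_l1(x1, x2, weights=(0.1, 0.9)):
    """d(X1, X2) = 0.1 * |X1[0] - X2[0]| + 0.9 * |X1[1] - X2[1]|."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x1, x2))


def k_nearest(query, points, k):
    """Return the indices of the k points closest to query under weighted_l1."""
    order = sorted(range(len(points)), key=lambda i: weighted_l1(query, points[i]))
    return order[:k]


# Hypothetical sample data: the second feature counts nine times as much.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 2.0)]
print(k_nearest((0.0, 0.0), points, 2))
```

An equivalent design is to multiply each feature column by its weight once and then use the ordinary Manhattan metric, which keeps tree-based search available.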

1

votes

1

answer

75

Views

### How can I get the nearest point of a Delaunay triangulation in Python?

In MATLAB, I can use the functions DelaunayTri and nearestNeighbor to find the nearest point. The code looks like this:
X1=[1,2,3,4,5,6,7]';
Y1=[1.3,1.5,1.7,1.9,2.1,2.3,2.5]';
Triangulation=DelaunayTri(Y1, X1);
X2=[1.5,2.5,3.5,4.5,5.5]';
Y2=[1.2,2.2,3.2,4.2,5.2]';
NearInd = nearestNeighbor(Triangulation, Y2, X...

1

votes

0

answer

32

Views

### Turi Create - stuck at “Starting brute force nearest neighbors model training”

I’m following the Turi Create tutorial on Image Similarity
https://apple.github.io/turicreate/docs/userguide/image_similarity/
I got stuck at:
Starting brute force nearest neighbors model training.
In [ ]:
It's been 14 hours already; I tried twice and it always gets stuck at this point.
Any clue what is...

1

votes

1

answer

99

Views

### How to find the nearest point among list of GPS coordinates

I have 2 sets of data: one with the coordinates of different vehicles (vehicledata) and another with coordinate points that are each assigned a region (sublap data). I want to classify the vehicles by region. However, some vehicle coordinates are not in the region data. I would...
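
One way to handle vehicles whose coordinates are missing from the region data is to assign each one the region of its nearest known point. A minimal sketch, assuming points are (latitude, longitude) pairs in degrees and great-circle (haversine) distance is acceptable; `haversine_km`, `nearest_region` and the sample data are hypothetical, not taken from the question.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_region(vehicle, region_points):
    """Return the region label of the region point closest to vehicle.

    vehicle is (lat, lon); region_points is a list of (lat, lon, region).
    This is a linear scan, so it costs O(len(region_points)) per vehicle.
    """
    closest = min(
        region_points,
        key=lambda p: haversine_km(vehicle[0], vehicle[1], p[0], p[1]),
    )
    return closest[2]


# Hypothetical region data for illustration only.
region_points = [(48.85, 2.35, "A"), (51.51, -0.13, "B")]
print(nearest_region((49.0, 2.5), region_points))
```

For large inputs, a spatial index would replace the linear scan; the distance function stays the same.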

1

votes

2

answer

182

Views

### Python kdtree find “n” nearest neighboring groups (of coordinates)

Objective: Given a coordinate X, find 'n' nearest line-polygon for coordinate X, not just 'n' nearest points. Example: https://i.imgur.com/qyxV2MF.png
I have a group of spatial line-polygons which can have more than 2 coordinates. Their coordinates are stored in a (scipy)KDtree to enable NN search....

1

votes

0

answer

25

Views

### nanoflann: two points equidistant from center

I'm using nanoflann to find the k nearest neighbors (e.g., the k closest points to x). Say, I find k points and define r to be the smallest radius that contains all k points. I then add a new point y such that y lies on the edge of the ball. When I again find the k closest points to x, y is NOT i...

1

votes

0

answer

39

Views

### Find the closest or the nearest path in sql

I have a table (table1) consisting of 5 columns in this order:
date | x | y | z | t
... | 10.24 | 12.01 | 6 | 7
... | 42 | 18 | 12 | 1
and ....
This table has 5 million records.
I have another table (table2) with the same structure as table1, but it has only 10 rows.
I...

1

votes

0

answer

21

Views

### Conditional mask considering neighborhood in Python

I want to binarize a 2D NumPy array using a gray-level condition, but considering a 4-connectivity neighborhood.
I guess the mask could be something like
0 1 0
1 1 1
0 1 0
In other words, I don't want isolated pixels to be extracted from the image, since I can do this simply by using numpy.where...

1

votes

0

answer

23

Views

### Finding nearest station index on the grid from given data points in lat/lon on a grid in Python

I have 30,000 data points on a grid and 4,000 station data points for which data is known. I need to find the index of the station closest to each of the 30,000 grid points. Please note that the locations of all the points are in lat/lon and I am writing the code in Python.
P.S.- I think, calculating...

1

votes

1

answer

23

Views

### How to find all numbers whose distance from a given point is less than or equal to an integer N

Given a set of points D and some number K, I want to find all numbers in D whose distance from K is less than or equal to an integer N.
Example:
Suppose we have D={5,9,0,6,7}, K=8 and N=1; then the result should be {9,7}.
I was thinking of using a k-d tree or a VP tree but...
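
For one-dimensional data like the example above, a sorted list with binary search is a simple alternative to a tree. A minimal sketch, assuming D fits in memory and is sorted once up front; `within_distance` is an illustrative name.

```python
from bisect import bisect_left, bisect_right


def within_distance(sorted_values, k, n):
    """Return every value v in sorted_values with |v - k| <= n.

    Two binary searches locate the slice [k - n, k + n], so a query costs
    O(log len(sorted_values)) plus the size of the answer.
    """
    lo = bisect_left(sorted_values, k - n)
    hi = bisect_right(sorted_values, k + n)
    return sorted_values[lo:hi]


values = sorted({5, 9, 0, 6, 7})
print(within_distance(values, 8, 1))  # [7, 9]
```

Note that K itself would be returned if it were a member of D.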

1

votes

0

answer

84

Views

### How to search in vantage point trees?

I am trying to understand vantage point trees and how to use them. To do that, I created a simple example and tried to solve it using a vantage point tree:
Suppose we have S={5,0,6.9,7} and we want to execute q=8 (search for the nearest neighbor to q in S). How do we do that?
My so...
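
A small vantage point tree for the example above can be sketched as follows. This is one common formulation under stated assumptions, not the asker's solution: the first point of each subset is taken as the vantage point, the radius is the median distance to the remaining points, and the names `VPNode`, `build` and `nearest` are illustrative.

```python
class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point
        self.radius = radius    # threshold distance separating the children
        self.inside = inside    # points with dist(vp, p) < radius
        self.outside = outside  # points with dist(vp, p) >= radius


def build(points, dist):
    """Build a VP tree; the first point of each subset is the vantage point."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return VPNode(vp, 0.0, None, None)
    dists = sorted(dist(vp, p) for p in rest)
    radius = dists[len(dists) // 2]  # median distance to the vantage point
    inside = [p for p in rest if dist(vp, p) < radius]
    outside = [p for p in rest if dist(vp, p) >= radius]
    return VPNode(vp, radius, build(inside, dist), build(outside, dist))


def nearest(node, q, dist, best=None):
    """Return (distance, point) of the nearest neighbor of q."""
    if node is None:
        return best
    d = dist(q, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Descend first into the side that contains q.
    if d < node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, q, dist, best)
    # By the triangle inequality, the other side can only hold a closer
    # point if the current best ball reaches the boundary at node.radius.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, q, dist, best)
    return best


def absdiff(a, b):
    return abs(a - b)


tree = build([5, 0, 6.9, 7], absdiff)
print(nearest(tree, 8, absdiff))  # (1, 7)
```

For q=8 the search returns 7 at distance 1, which matches a brute-force scan of S.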

1

votes

0

answer

16

Views

### Finding the nearest grid box to a point in MATLAB

I'm using MATLAB to analyze model data.
Right now, I'm dealing with latitude, longitude, and altitude data. Both latitude and longitude are matrices of 336 x 264 and their output is in units of degrees. My altitude data is 3 dimensional. The first two dimensions are latitude and longitude and th...

1

votes

2

answer

507

Views

### How to match Tagged items based on “similarity”

I have a real question.
I have a database with the schema as follows:
item
    id
    description
    other junk
tag
    id
    name
item2tag
    item_id
    tag_id
    count
Basically, each item is tagged as up to 10 things, with varying counts. There are 50,000 items and 50,000 tags, and about 500,000 entries in item2tag. I'...

1

votes

2

answer

1k

Views

### Finding elements within distance k from a matrix

Given an n*n matrix and a value k, how do we find all the neighbors for each element?
For example, in a 4*4 matrix with k=2,
say the matrix is:
[ 1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16]
where these values are the indexes of the location, the neighbors for 1 are 1,2,3,5,6,9 . The values 3,6 and 9...
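
The example above (neighbors of 1 are 1,2,3,5,6,9 for k=2) is consistent with Manhattan distance, so the sketch below assumes that metric and 1-based row-major indexes; both are inferred from the example rather than stated in the question. `neighbors_within` is an illustrative name.

```python
def neighbors_within(n, index, k):
    """Return the 1-based row-major indexes of an n*n matrix that lie
    within Manhattan distance k of the cell numbered `index`."""
    row, col = divmod(index - 1, n)
    result = []
    for r in range(max(0, row - k), min(n, row + k + 1)):
        span = k - abs(r - row)  # distance budget left for the column offset
        for c in range(max(0, col - span), min(n, col + span + 1)):
            result.append(r * n + c + 1)
    return result


print(neighbors_within(4, 1, 2))  # [1, 2, 3, 5, 6, 9]
```

Each query visits only the cells inside the diamond, so the cost per element is O(k^2) rather than O(n^2).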

1

votes

1

answer

389

Views

### How to perform a multidimensional search for “N-nearest neighbors?”

I am designing an automated trading software for the foreign exchange market.
In a MYSQL database I have years of market data at five-minute intervals. I have 5 different metrics for this data alongside the price and time.
[Time|Price|M1|M2|M3|M4|M5]
x ~400,0000
Time is the primary key, and M1 thr...

1

votes

2

answer

1.9k

Views

### K-Nearest Neighbors and MySql Geographical Index

I have a set of geo-tagged pictures in a MySQL database. You can consider my Pictures table to be:
create table `Pictures` (
location Point NOT NULL,
timeCreated timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
SPATIAL INDEX(location)
)ENGINE= MyISAM DEFAULT CHARSET=utf8;
I intend to perform a K-Nearest...

1

votes

3

answer

6k

Views

### KD Tree - Nearest Neighbor Algorithm

I'm not quite understanding the O(log n) nearest neighbor algorithm from Wikipedia.
…
…
The algorithm unwinds the recursion of the tree, performing the following steps at each node:
...
The algorithm checks whether there could be any points on the other side of the splitting plane that are close...
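
The step quoted above, checking whether the other side of the splitting plane could hold a closer point, can be sketched as follows. This is a minimal illustration assuming points are tuples of equal length and the tree is built by median splits; `build_kdtree`, `sq_dist` and `nearest` are illustrative names, and the sample points are the ones commonly used in k-d tree illustrations.

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree as nested dicts, splitting on the median point."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(query, node["point"]) < sq_dist(query, best):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    # Search the side of the splitting plane that contains the query first.
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, best)
    # The far side can only contain a closer point if the sphere around the
    # query with the current best radius crosses the splitting plane.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))  # (8, 1)
```

The comparison `diff ** 2 < sq_dist(query, best)` is the pruning test: when it fails, the whole far subtree is skipped, which is where the average-case O(log n) behaviour comes from.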

1

votes

2

answer

283

Views

### Multithreading a read-only function of a tree data structure: what do I need to make thread safe?

I have a tree data structure. In addition to the normal insert and removal functions I have a COMPUTE function which computes certain values from the nodes present in the tree. The insert and removal functions affect the tree. But the COMPUTE function does not modify the tree. The COMPUTE functi...

1

votes

1

answer

443

Views

### Neighbourhood Components Analysis implementation in Octave/Matlab

I've been trying to implement the Neighbourhood Component Analysis (NCA) algorithm in Octave, but apparently there's something wrong with my code and I cannot figure out what it is.
Note: I am using Carl Edward Rasmussen's minimize function for maximization of the negative f.
Note 2: The test data I...

1

votes

4

answer

1.1k

Views

### Tracking GPS points and finding their nearest neighbours?

I have a list of 1 million (slowly) moving points on the globe (stored as latitude and longitude). Every now and then, each point requests a list of the 100 nearest other points (with a configurable max range, if that helps).
Unfortunately, SELECT * SORT BY compute_geodetic_distance() LIMIT 100 is t...

1

votes

1

answer

1.8k

Views

### Scikit-learn - user-defined weights function for KNeighborsClassifier

I have a KNeighborsClassifier which classifies data based on 4 attributes. I'd like to weight those 4 attributes manually but always run into 'operands could not be broadcast together with shapes (1,5) (4)'.
There is very little documentation on weights : [callable] : a user-defined function which a...

1

votes

2

answer

204

Views

### Faster ways to search points close to a line

I have a set of points and a line in 2D space. I need to find all points that lie within a distance D from the line. Is there a way for me to do this without having to actually compute distances di of all points from the line? Is there a solution better than linear search?
Edit: I need to search thr...

1

votes

1

answer

2k

Views

### Traveling Salesman - 2-Opt improvement

So I've been looking for an explanation of a 2-opt improvement for the traveling salesman problem, and I get the gist of it, but I don't understand one thing.
I understand that IF two edges of a generated path cross each other, I can just switch two points and they will no longer cross. HOWEVER - I...
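
As general background on the move described above: a 2-opt step removes two edges and reconnects the tour by reversing the segment between them, which reduces to swapping two points only when that segment has exactly two cities. A minimal sketch, assuming a closed tour over 2D points and a simple repeat-until-no-improvement loop; `tour_length` and `two_opt` are illustrative names and the sample data is hypothetical.

```python
import math


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the order given by tour."""
    return sum(
        math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(tour, pts):
    """Repeatedly apply improving 2-opt moves until none is left."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                # Reverse the segment tour[i..j]; this replaces the edges
                # entering position i and leaving position j.
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, pts) < tour_length(tour, pts) - 1e-12:
                    tour = candidate
                    improved = True
    return tour


# Unit square visited in an order that makes two edges cross.
pts = [(0, 0), (1, 1), (1, 0), (0, 1)]
print(two_opt([0, 1, 2, 3], pts))  # [0, 2, 1, 3]
```

Recomputing the full tour length for every candidate keeps the sketch short; a practical version would compare only the two removed and two added edges.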

1

votes

1

answer

6k

Views

### Find nearest neighbor names from KKNN package

I have been trying to build this program or find out how to access what KKNN does to produce its results. I am using the KKNN function and package to help predict future baseball stats. It takes in 11 predictor variables (previous 3 year stats, PA and level, along with age and another predictor). Th...

1

votes

1

answer

783

Views

### How to write a distance LINQ query when Lat and Long are stored as varchar in SQL Server

How to write a LINQ query for nearest points calculation?
I have Lat and Lng in a table saved as string in a SQL Server database.
I found one solution on Stack Overflow; the query looks like this:
var coord = DbGeography.FromText(String.Format('POINT({0} {1})', latitude.ToString().Replace(',', '.'), longi...

1

votes

1

answer

1.7k

Views

### k nearest neighbor in SAS: how to get the neighbor list for each row?

Currently I'm using proc discrim in SAS to run a kNN analysis on a data set, but the problem may require me to get the top k neighbor list for each row in my table. How can I get this list from SAS?
Thanks for the answer, but I'm looking for the neighbor list for each data point, for examp...

1

votes

1

answer

81

Views

### k-neighbors of a given matrix M x N

I am trying to write code that, given a position (x,y) in a matrix of integers, iterates over all neighbors (left, right, up, down and diagonals) at distance K, like this:
K = 1
01 02 03 04 05 06
07 08 09 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
for i in range(M):
for...
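
Because diagonals count as neighbors here, one reading of "distance K" is Chebyshev distance. The sketch below assumes that reading, assumes the neighbors wanted are those at exactly distance K, and treats (x, y) as (row, column); all three are assumptions, and `neighbors_at_distance` is an illustrative name.

```python
def neighbors_at_distance(matrix, x, y, k):
    """Yield the values whose Chebyshev distance from (x, y) is exactly k.

    Cells outside the matrix are skipped by clamping the loop bounds.
    """
    rows, cols = len(matrix), len(matrix[0])
    for i in range(max(0, x - k), min(rows, x + k + 1)):
        for j in range(max(0, y - k), min(cols, y + k + 1)):
            if max(abs(i - x), abs(j - y)) == k:
                yield matrix[i][j]


# The 5 x 6 grid from the question, numbered 1..30 row by row.
grid = [[r * 6 + c + 1 for c in range(6)] for r in range(5)]
print(list(neighbors_at_distance(grid, 1, 1, 1)))  # neighbors of the cell holding 8
```

Changing the comparison to `<= k` and skipping the centre cell would give every neighbor up to distance K instead.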

1

votes

1

answer

637

Views

### More efficient solution for finding smallest k elements in List

In the example below I'm attempting to return the 2 smallest elements (nearest neighbours) of which 'a' is a member.
So the two smallest elements for 'a', based on:
List((('a','b'),1.0) , (('a','c'),4.0) , (('a','c'),3.0) , (('b','c'),2.0) )
is
List((('a','b'),1.0) , (('a','c'),3.0))
Here is my solution :...
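
For comparison, the same selection can be written with a bounded heap so the whole list is never fully sorted. This sketch is in Python rather than the question's language and is not the asker's solution; `k_smallest_for` is an illustrative name.

```python
import heapq

edges = [(("a", "b"), 1.0), (("a", "c"), 4.0), (("a", "c"), 3.0), (("b", "c"), 2.0)]


def k_smallest_for(member, edges, k):
    """Return the k lowest-weight edges that have `member` as an endpoint."""
    candidates = (edge for edge in edges if member in edge[0])
    # nsmallest keeps at most k items at a time: O(n log k) instead of O(n log n).
    return heapq.nsmallest(k, candidates, key=lambda edge: edge[1])


print(k_smallest_for("a", edges, 2))  # [(('a', 'b'), 1.0), (('a', 'c'), 3.0)]
```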

1

votes

2

answer

240

Views

### c++ Finding closest four of a set of points [closed]

I have a set of points, each with x and y coordinates. I would like to find the 4 of these points that are closest together (if plotted, all the points would be at different locations, but 4 of these points are always closer to each other, and I want to be able to identify which of the points t...

1

votes

1

answer

676

Views

### Scaling an Image Byte Array via Nearest Neighbor

I am trying to scale an image by using a byte array. My plan was to use the nearest neighbor to find the pixel data for the new byte array but I am having some issues transforming the source data of my image. srcImage is of type Image and I am able to successfully convert it to a byte array and conv...
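
The nearest-neighbor mapping described above can be sketched on a raw byte array as follows. This is a minimal illustration in Python, assuming a tightly packed row-major buffer with a fixed number of bytes per pixel; `scale_nearest` and its parameters are hypothetical and unrelated to the Image type in the question.

```python
def scale_nearest(src, src_w, src_h, dst_w, dst_h, channels=1):
    """Scale a row-major pixel buffer using nearest-neighbor sampling.

    Each destination pixel (x, y) copies the source pixel at
    (x * src_w // dst_w, y * src_h // dst_h).
    """
    dst = bytearray(dst_w * dst_h * channels)
    for y in range(dst_h):
        sy = y * src_h // dst_h
        for x in range(dst_w):
            sx = x * src_w // dst_w
            s = (sy * src_w + sx) * channels  # offset of the source pixel
            d = (y * dst_w + x) * channels    # offset of the destination pixel
            dst[d:d + channels] = src[s:s + channels]
    return bytes(dst)


# A 2 x 2 single-channel image doubled to 4 x 4.
small = bytes([1, 2, 3, 4])
print(list(scale_nearest(small, 2, 2, 4, 4)))
```

Integer division keeps the source index in range without any rounding step, which is the usual reason to prefer it over floating-point scaling here.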

1

votes

1

answer

821

Views

### Example from LSHForest: results not convincing

The library and corresponding documentation are as follows (yes, I read everything and was able to run it on my own code).
http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.LSHForest.html
But the results do not really make sense to me, so I went through the example (which is includ...

1

votes

2

answer

265

Views

### PL/PGSQL always returns array or list of arrays

Given the simple pl/pgsql function:
CREATE OR REPLACE FUNCTION foo (point geometry
, OUT _street text
, OUT _gid int
, OUT distance real)
AS $$
BEGIN
SELECT min(distance(point,geom)) as dist, gid, name into distance, _gid, _street
from streets
where geometria && Expand(point,0.001) group by gid, n...