# Questions tagged [knn]

353 questions

2

votes

0

answer

19

Views

### What's the most optimal query in PgSql to find the nearest neighbor in a huge dataset?

I have a huge table (about 40 million rows) called nearest_spot representing lines (in linestring format) and the spot each line is closest to (there are about 1,500 different spots, stored in another table). The nearest_spot table looks like this:
data_id || spot_id || spot_name || link_geom
Where d...

1

votes

2

answer

269

Views

### Correct implementation of weighted K-Nearest Neighbors

From what I understood, the classical KNN algorithm works like this (for discrete data):
Let x be the point you want to classify
Let dist(a,b) be the Euclidean distance between points a and b
Iterate through the training set points pᵢ, taking the distances dist(pᵢ,x)
Classify x as the most frequ...
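
The steps described in this excerpt can be sketched in plain Python. This is a minimal illustration, not the asker's code: the function name `knn_classify`, the toy `train` data, and the 1/distance weighting are all assumptions (inverse distance is one common weighting scheme; others exist).

```python
import math
from collections import Counter, defaultdict


def knn_classify(train, x, k=3, weighted=False):
    """Classify x from a list of (point, label) pairs.

    weighted=False: plain majority vote among the k nearest points.
    weighted=True: each neighbour votes with weight 1/distance
    (an assumed weighting scheme, chosen for illustration).
    """
    # Distance from x to every training point, nearest first, keep k.
    neighbours = sorted((math.dist(p, x), label) for p, label in train)[:k]

    if not weighted:
        return Counter(label for _, label in neighbours).most_common(1)[0][0]

    scores = defaultdict(float)
    for d, label in neighbours:
        if d == 0:
            return label  # exact match: avoid dividing by zero
        scores[label] += 1.0 / d
    return max(scores, key=scores.get)


# Hypothetical training set: two "a" points near the origin, three "b" points near (1, 1).
train = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((1.0, 1.0), "b"),
         ((1.1, 1.0), "b"), ((1.0, 1.1), "b")]

plain = knn_classify(train, (0.05, 0.0), k=5)
weighted = knn_classify(train, (0.05, 0.0), k=5, weighted=True)
```

With k=5 the plain vote is outnumbered by the three distant "b" points, while the weighted vote favours the two very close "a" points, which is the usual motivation for weighting.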

1

votes

1

answer

50

Views

### How can I find nearest neighbors of points in a data frame from another data frame

I want to find k nearest neighbors of all points in dataframe A from a dataframe B. How is that doable?
It seems sklearn.neighbors.NearestNeighbors takes only one set of data, and just one query point.
Like:
samples = [[0., 0., 0.], [0., .5, 0.], [1., 1., .5]]
from sklearn.neighbors import NearestNe...
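
For what it's worth, `NearestNeighbors` can be fitted on one set and then queried with a whole array of points through `kneighbors`. The dependency-free sketch below shows the same idea by brute force; the function name `k_nearest_in_b` and the points in `A` are made up for illustration.

```python
import heapq
import math


def k_nearest_in_b(a_points, b_points, k):
    """For each point in a_points, return the indices of its k nearest
    points in b_points (nearest first), using brute-force Euclidean distance."""
    result = []
    for a in a_points:
        nearest = heapq.nsmallest(
            k, range(len(b_points)), key=lambda j: math.dist(a, b_points[j])
        )
        result.append(nearest)
    return result


# B plays the role of "dataframe B" (reference set); A holds hypothetical query rows.
B = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [1.0, 1.0, 0.5]]
A = [[1.0, 1.0, 1.0], [0.0, 0.1, 0.0]]
neighbours = k_nearest_in_b(A, B, k=2)
```

Each row of `neighbours` lists row indices into B for the matching row of A. Brute force costs one distance per (A, B) pair, so a tree-based index is likely preferable for large frames.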

0

votes

0

answer

5

Views

### Revised the KNN model on Iris

I tried to use KNN (with Euclidean distance) to predict and get accuracy on the Iris data without using scikit-learn. However, I have no idea how to proceed to the next step.
Regression: The summarization of the closest instances could involve taking the mean of the predicted attribute to revise the model....

1

votes

0

answer

76

Views

### the implementation of lazy multi label classifiers in Mulan

I want to use k-nearest neighbors for multi-label classification. There are some kNN-based classifiers implemented in the Mulan library, or written in C or MATLAB, such as MLKNN.
When I use the same classifier on a numeric dataset I get identical results,
but for nominal dataset such as sla...

1

votes

0

answer

1.2k

Views

### Issue defining KneighborsClassifier in Jupyter Notebooks

I am attempting to utilize KNN on the Iris data set as a 'Hello World' of Machine Learning. I am using a Jupyter Notebook from Anaconda and have been clearly documenting each step. A 'NameError: name 'knn' is not defined' exception is currently being thrown when I attempt to use knn.fit(X,Y) What...

1

votes

0

answer

374

Views

### Machine learning - PCA and KNN on rgb images is too slow (python)

I work with Python and images of tables (taken from above). My aim is to take a photo of a random table and then find the most similar tables to it in my database. Obviously, the main feature which distinguishes the tables is their shape (square, rectangular, round, oval) but there are also other d...

1

votes

1

answer

107

Views

### Design L1 and L2 distance functions to assess the similarity of bank customers. Each customer is characterized by the following attribute

I am having a hard time with the question below. I am not sure if I got it correct, but either way, I need some help understanding it further. If anyone has time to explain, please do.
Design L1 and L2 distance functions to assess the similarity of bank customers. Each customer is characterized by th...
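
As a generic reference, the two distances can be sketched as follows. The attribute list is cut off in the excerpt, so the vectors `c1` and `c2` are purely hypothetical numeric features, assumed to be already scaled to comparable ranges.

```python
import math


def l1_distance(u, v):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def l2_distance(u, v):
    """Euclidean (L2) distance: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical customers described by three numeric, pre-scaled attributes.
c1 = (0.0, 3.0, 1.0)
c2 = (4.0, 0.0, 1.0)

d1 = l1_distance(c1, c2)  # |0-4| + |3-0| + |1-1|
d2 = l2_distance(c1, c2)  # sqrt(16 + 9 + 0)
```

Categorical attributes would need their own per-attribute difference (for example 0 if equal, 1 otherwise) before being summed in the same way.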

1

votes

1

answer

498

Views

### When I use pd.crosstab it keeps showing AssertionError

When I use pd.crosstab to build confusion matrices, it keeps showing
AssertionError: arrays and names must have the same length
import pandas as pd
import numpy as np
from sklearn.metrics import confusion_matrix
import random
df = pd.read_csv('C:\\Users\\liukevin\\Desktop\\winequality-red.csv',sep=...

1

votes

1

answer

757

Views

### “Can not convert a ndarray into a Tensor or Operation” using pandas dataframe in tensorflow

I am learning some basic TensorFlow and ran into a problem. I am trying to load data from a file using pandas and then perform k-nearest neighbors on the dataset. However, I keep running into a problem.
It seems that Tensorflow does not work with ndarray from numpy, I am stuck here for two...

1

votes

2

answer

92

Views

### knn algorithm output one result per test data

I used knn to do a basic predictive model build.
After I run:
predictions = knn(train,test,cl,k=3)
and then output predictions, the R console has a dozen results per row.
Such as:
[1] Yes No Yes Yes Yes No Yes Yes Yes No No
[12] No Yes Yes Yes No No No Yes Yes Yes No
etc etc for 10000 rows.
I need t...

1

votes

1

answer

284

Views

### Providing user defined sample weights for knn classifier in scikit-learn

I am using the scikit-learn KNeighborsClassifier for classification on a dataset with 4 output classes. The following is the code that I am using:
knn = neighbors.KNeighborsClassifier(n_neighbors=7, weights='distance', algorithm='auto', leaf_size=30, p=1, metric='minkowski')
The model works correctl...

1

votes

0

answer

208

Views

### How to find n nearest neighbors using a built-in function in R

I have a dataset with a following structure:
Number of Rows = 1000
Each row corresponds to an article
Number of Columns = 502
Column 1 : Title of the article
Column 2 : Text of the article
From column 3 to column 502 : Each column corresponds to a unique word appearing in the corpus of all the art...

1

votes

0

answer

27

Views

### Enter values of attributes of a new example to predict its label to RapidMiner Process using Netbeans

I have a simple KNN process which takes an Excel dataset as the training data. I can run it from inside NetBeans and display the accuracy of the model it generates. My question is: how can I enter values for a new example (without entering the label value, of course, which I want to predict) from...

1

votes

1

answer

474

Views

### Implementing KNN with different distance metrics using R

I am working on a dataset in order to compare the effect of different distance metrics. I am using the KNN algorithm.
The KNN algorithm in R uses the Euclidean distance by default, so I wrote my own. I would like to find the number of correct class label matches between the nearest neighbor and...

1

votes

0

answer

32

Views

### How can I determine classes in k-nn

My problem is as follows: I have one unknown value and two classes. I calculate the distances from an unknown value to every known value, and I make one matrix. But now I have to take the three smallest distances and I must determine their class, and I don't know how I can take these three values an...

1

votes

0

answer

109

Views

### Nearest Neighbor matching with replacement Python

I have 2 dataframes, df_test and df_control. For each element in df_test, I am looking for the closest match in df_control based on a feature_list.
I have seen the NearestNeighbors function in scikit-learn (also this answer). However, this function does not have an option for sampling without replac...

1

votes

1

answer

133

Views

### Converting from dgCMatrix/dgRMatrix to scipy sparse matrix

I am working on the netflix data set and attempting to use the nmslibR package to do some KNN type work on the sparse matrix that results from the netflix data set. This package only accepts scipy sparse matrices as inputs, so I need to convert my R sparse matrix to that format. When I attempt to do...

1

votes

0

answer

25

Views

### kNN to predict LPR

I am doing a License Plate Recognition system using python. I browsed through the net and I found many people have done the recognition of characters in the license plate using kNN algorithm.
Can anyone explain how we predict the characters in the License Plate using kNN ?
Is there any other algori...

1

votes

0

answer

26

Views

### Export or Create a Dataset of the Imputed or Transformed Variables Python

I was running KNN for my dataset, for which I had to impute the missing values and then transform the variables so that they can lie between 0 and 1.
I have to use these predicted results as inferred performance and make a TTD Model for the same.
When I use predict I can get the predicted probabiliti...

1

votes

0

answer

230

Views

### Do we have kNNdistplot in Python

There is a function called kNNdistplot to draw the k-distance plot.
Do we have a (similar) kNNdistplot function in Python?

1

votes

1

answer

130

Views

### MapReduce-KNN for Hadoop - run multiple test cases from one data file

Background: [Skip ahead to next section for exact problem]
I am currently working on Hadoop as a small project in my University (not a mandatory project, I am doing it because I want to).
My plan was to use 5 PCs in one of the labs (Master + 4 Slaves) to run a KNN algorithm on a large data set to fi...

1

votes

0

answer

151

Views

### KNN model gets 100% accuracy - should I trust the results?

So I'm playing with the iris dataset and no matter what my value of k in the knn algorithm, I'm getting 100% accuracy. Have I gone wrong somewhere?
Here's my code using the in-built iris data frame.
library(caret)
set.seed(52)
irissplit

1

votes

0

answer

98

Views

### Implementing Knn using DTW as a distance measure in R programming language

I am trying to implement KNN using DTW as the distance measure in R. Below is the code I am trying to implement:
## KNN + DTW
knn

1

votes

0

answer

17

Views

### Which result should I choose if the knn result returns the same quantity

I have run a KNN=5, and the result contains:
2: upgrade
2: remains
1: Downgrade
I got 2 'upgrade' & 2 'remains'.
How do I decide on the expected results?
Upgrade or remain?
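
One common way to resolve such a tie is to drop the farthest neighbour and vote again until a single label wins. The sketch below assumes the distances to the five neighbours are known; the distance values and the function name `vote_with_tie_break` are hypothetical.

```python
from collections import Counter


def vote_with_tie_break(neighbours):
    """Majority vote over (distance, label) pairs.

    On a tie for first place, discard the farthest neighbour and vote again.
    This is one possible rule; picking the label with the smallest total
    distance, or choosing an odd k up front, are alternatives.
    """
    ranked = sorted(neighbours)  # nearest first
    while ranked:
        counts = Counter(label for _, label in ranked).most_common()
        if len(counts) == 1 or counts[0][1] > counts[1][1]:
            return counts[0][0]
        ranked.pop()  # remove the farthest remaining neighbour
    raise ValueError("no neighbours given")


# Hypothetical distances for the five neighbours in the question.
neighbours = [(0.2, "upgrade"), (0.3, "remains"), (0.4, "remains"),
              (0.5, "Downgrade"), (0.6, "upgrade")]
winner = vote_with_tie_break(neighbours)
```

With these made-up distances the farthest neighbour is an "upgrade", so removing it leaves "remains" ahead two to one.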

1

votes

0

answer

76

Views

### KNN Regression results in zero MSE on training set (sklearn)

Using sklearn and trying to evaluate a KNN regression function with the below code:
def cross_validate(X,y,n_neighbors, test_size=0.20):
training_mses = []
test_mses = []
n = X.shape[ 0]
test_n = int( np.round( test_size * n, 0))
indices = np.arange(n)
random.shuffle( indices)
test_indices = indices...
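
The excerpt is cut off, so the actual cause cannot be read from it. One frequent explanation for a zero training MSE, offered here only as a possibility, is that with k = 1 every training point is its own nearest neighbour. The plain-Python sketch below (hypothetical data and helper names, not sklearn) shows the effect.

```python
import math


def knn_regress(train_X, train_y, x, k):
    """Predict the mean target of the k training points nearest to x."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical one-dimensional training data with distinct points.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]

# Evaluate on the training set itself.
train_mse_k1 = mse(y, [knn_regress(X, y, x, 1) for x in X])
train_mse_k2 = mse(y, [knn_regress(X, y, x, 2) for x in X])
```

Here `train_mse_k1` is exactly zero because each point predicts its own target, while `train_mse_k2` is positive. Distance-weighted averaging produces the same effect on training data, so a zero training error on its own says little about generalisation.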

1

votes

0

answer

88

Views

### Feed CNN features to KNN algorithm in python

I want to build an end-to-end trainable model that does the following:
extract features from a face image using a CNN
apply the KNN algorithm to the previously extracted features
I built my CNN using Python 3.5:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)
X_train = X_tr...

1

votes

1

answer

28

Views

### WEKA IBk wrong Results for EditDistance (Levenshtein distance) - JAVA

I'm very new to WEKA and today tried the IBk algorithm to classify strings into different classes using the Levenshtein distance as the distance function. However, I'm getting very bad results. My inputs are always assigned the same class (Class b), which is not correct at all. Can somebody tell me what i...

1

votes

0

answer

36

Views

### How to classify patterns using KNN in python

I have a huge (around 700,000 records) CSV file in the following format:
t,X,Y
0.00065,0,10
0.000795,0,12
0.039068,2,13
0.03913,4,17
0.039901,4,10
0.039925,5,21
0.039945,6,22
0.039961,7,25
0.040875,9,27
0.040915,9,31
0.041167,11,33
0.041203,12,34
0.139602,6,41
0.139687,13,87
0.139727,13,87
From column...

1

votes

0

answer

42

Views

### How to use SelectKBest for a classification problem

I'm doing a classification Problem.
Assume this is my train and test dataset.
X_train.shape y_train
(7500, 5760) (7500,)
x_test.shape y_test
(2500, 5760) (2500,)
After using knn classifier I got an accuracy of 0.74%.
Now I want to select top 1000 features and again use kn...

1

votes

0

answer

56

Views

### Image -> Connectivity Matrix -> Networkx Graph (with nodes as pixel coordinates of original image)

I have a binary image and I want to create a knn graph to get the image outline (in order to reduce noise eventually by averaging shortest edges). I'm using sklearn.neighbors.kneighbors_graph to create a connectivity matrix and then I am feeding that adjacency matrix into networkx to create a visua...

1

votes

0

answer

21

Views

### Ind function for KNN

It's my first contact with R (or any other language) and I'm trying to understand these lines from a KNN project:
#Data partition
ind

1

votes

2

answer

44

Views

### I get the error NAs introduced by coercionNAs when trying to run kNN in R?

I am trying to run kNN on a dataset but I keep getting some NA error. I have exhausted stack overflow trying to find a solution to this problem. I could not find anything useful anywhere.
This is the dataset I am working with : https://www.kaggle.com/tsiaras/uk-road-safety-accidents-and-vehicles
I h...

1

votes

1

answer

23

Views

### Invalid shape error while using Knn Classifier

Following are the X and Y variable shapes:
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
## Output for shapes
X_train.shape = (970, 298)
X_test.shape = (478, 298)
len(y_train) = 970
len(y_test) = 478
Now I assign Multi-output classifier from K...

1

votes

2

answer

182

Views

### Dynamic closest elements

I have a 2D surface ( Grid ) with 50 elements at different locations.
I need to decide which are the 10 closest elements to a given point.
In addition, the given point is constantly moving and I need to do the calculation on each movement.
I know I can calculate the Euclidean distance to each point...
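
For only 50 elements, recomputing all distances on every movement is cheap. A minimal sketch of that baseline, assuming elements are (x, y) tuples and using a hypothetical 10 x 5 layout, could be:

```python
import heapq


def closest_elements(elements, point, n=10):
    """Return the n elements nearest to point.

    Squared distance is used as the key: it preserves the ordering of
    Euclidean distance and avoids the square root.
    """
    px, py = point
    return heapq.nsmallest(
        n, elements, key=lambda e: (e[0] - px) ** 2 + (e[1] - py) ** 2
    )


# Hypothetical layout: 50 elements on a 10 x 5 grid.
elements = [(x, y) for x in range(10) for y in range(5)]

# Call again with the new position each time the point moves.
near_origin = closest_elements(elements, (0, 0), n=3)
near_centre = closest_elements(elements, (4.5, 2.0), n=10)
```

A spatial index such as a k-d tree or a uniform grid only starts to pay off once the number of elements is much larger than 50.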

1

votes

2

answer

6.3k

Views

### How to find an accuracy of a classifier

I am using the KNN classifier and I found the knnclassify does the classification for me in MATLAB.
code:
Class = knnclassify(TestVec,TrainVec, TrainLabel);
The problem I face now, knnclassify just classifies the points and gives them a value but I would like to find the accuracy of this classifica...
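
Accuracy is the fraction of test points whose predicted class equals the true class, so it requires the true labels of the test vectors, which the excerpt does not show. The sketch below is in Python rather than MATLAB, and `predicted` and `actual` are hypothetical stand-ins for the output of the classifier and the known test labels.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: three of the four predictions are right.
predicted = [1, 2, 2, 1]
actual = [1, 2, 1, 1]
acc = accuracy(predicted, actual)
```

The same comparison in any language is an element-wise equality test between the two label vectors followed by a mean.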

1

votes

1

answer

321

Views

### k-nearest neighbour script for each data point

I have a large set of features which looks like this:
x1 28273 20866 29961 27190 31790 19714 8643 14482 5384 .... upto 1000
x2 12343 45634 29961 27130 33790 14714 7633 15483 4484 ....
x3 ..... ..... ..... ..... ..... ..... .... ..... .... ....
.
.
.
.
x1000 .... .... ... .. . . . ....

1

votes

1

answer

1.9k

Views

### Find the classification rate of testing data

I need to use KNN search to classify the testing data and find the classification rate.
Below is the matlab code:
for example:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
load fisheriris
x = meas(:,3:4); % x =all training data
y = [5 1.45;6 2;2.75 .75]; % y =3 testing data
[n,d] = knnsearch(x,y,'k'...

1

votes

1

answer

144

Views

### Find nearest neighbours in large set

I have a large set of points in multidimensional space, and I'd like to find a few neighbours (within the neighbourhood) of any given point (the requirement is to avoid scanning all points).
I want to know if my solution is appropriate:
Preprocessing:
Define a set of orthogonal axes
Make a projection of each...

1

votes

2

answer

1.3k

Views

### k-Nearest Neighbour Algorithm

I am implementing the k-Nearest Neighbour algorithm on my smart device in order to recognize human activities from recognition data. I am going to explain how I plan to implement it. Can you suggest any improvements to the steps I am taking and answer any questions that I might...