Questions tagged [classification]

1

votes
1

answer
525

Views

Trading in precision for better recall in Keras classification neural net

There's always a tradeoff between precision and recall. I'm dealing with a multi-class problem, where for some classes I have perfect precision but really low recall. Since for my problem false positives are less of an issue than missing true positives, I want to reduce precision in favor of increasin...
megashigger
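
The tradeoff described in this question can be illustrated with a minimal sketch: lowering the decision threshold for one class usually raises its recall at the cost of precision. The scores and labels below are made-up numbers for a single class treated one-vs-rest, and `precision_recall` is a hypothetical helper, not part of Keras.

```python
def precision_recall(y_true, scores, threshold):
    """Return (precision, recall) for the positive class at a given threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, t in zip(predicted, y_true) if p and t == 1)
    fp = sum(1 for p, t in zip(predicted, y_true) if p and t == 0)
    fn = sum(1 for p, t in zip(predicted, y_true) if not p and t == 1)
    # Conventions for empty denominators are an assumption of this sketch.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical one-vs-rest scores for a single class (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.2, 0.5, 0.3, 0.1, 0.05]

strict = precision_recall(y_true, scores, 0.55)  # high threshold
loose = precision_recall(y_true, scores, 0.25)   # lower threshold
print("strict:", strict)
print("loose:", loose)
```

In this toy data the lower threshold recovers one more true positive but also admits two false positives, which is the direction of trade the question asks for.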
1

votes
2

answer
269

Views

Correct implementation of weighted K-Nearest Neighbors

From what I understood, the classical KNN algorithm works like this (for discrete data): let x be the point you want to classify; let dist(a,b) be the Euclidean distance between points a and b; iterate through the training set points pᵢ, taking the distances dist(pᵢ,x); classify x as the most frequ...
Daniel
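
The steps listed in this question can be sketched as follows. The unweighted branch follows the description above (majority vote among the k nearest points); the weighted branch uses inverse-distance weighting, which is one common choice and an assumption here, since the excerpt does not say which weighting scheme is meant. The function name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def knn_predict(train, x, k=3, weighted=False, eps=1e-9):
    """Classify x from a list of (point, label) pairs using k nearest neighbours."""
    # Sort training points by Euclidean distance to x and keep the k closest.
    neighbours = sorted(train, key=lambda pl: math.dist(pl[0], x))[:k]
    votes = defaultdict(float)
    for point, label in neighbours:
        d = math.dist(point, x)
        # Assumed weighting: 1 / distance (eps avoids division by zero).
        votes[label] += 1.0 / (d + eps) if weighted else 1.0
    return max(votes, key=votes.get)


# Toy data: one close point of class "a", two farther points of class "b".
train = [((0.0, 0.0), "a"), ((3.0, 0.0), "b"), ((3.0, 1.0), "b")]
x = (1.0, 0.0)

plain = knn_predict(train, x, k=3)                    # majority vote
by_distance = knn_predict(train, x, k=3, weighted=True)  # distance-weighted vote
print(plain, by_distance)
```

With these points the majority vote picks the more frequent class, while the distance-weighted vote favours the single closest neighbour, which shows how the two variants can disagree.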
-2

votes
0

answer
19

Views

How many neurons are in the input layer for model.add(Conv2D(64, kernel_size=(3, 3), input_shape=(200, 200, 3)))?

1) How many neurons are in the input layer? I'm giving the input image size as 200*200. 2) I guess the number of neurons in the input layer should be the number of pixels (features) of an input image (in this case 200*200). 3) What if there are more neurons in the input layer than the feature...
Rohan Dhere
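
The arithmetic behind this question can be worked through in a few lines. This is a rough sketch that assumes stride 1, no padding, and one bias per filter; the helper `conv2d_summary` is hypothetical and does not call Keras.

```python
def conv2d_summary(input_shape, filters, kernel_size):
    """Return (input value count, trainable parameter count, output shape)."""
    h, w, c = input_shape
    kh, kw = kernel_size
    # Every pixel in every channel is one input value.
    input_values = h * w * c
    # Each filter has kh*kw*c weights plus one bias (assumed).
    params = filters * (kh * kw * c + 1)
    # Assumes stride 1 and no padding, so each spatial side shrinks by kernel - 1.
    output_shape = (h - kh + 1, w - kw + 1, filters)
    return input_values, params, output_shape


summary = conv2d_summary((200, 200, 3), filters=64, kernel_size=(3, 3))
print(summary)
```

Under these assumptions the 64 in the layer definition is the number of filters, not the number of input values; the input size is fixed by input_shape alone.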
0

votes
0

answer
4

Views

Classification: Target with more than 2 classes

I am doing a classification exercise and facing a target with more than 2 categorical classes. I have encoded those classes using the LabelEncoder. The only problem is, I believe I might have to use one-hot encoding afterwards, as I no longer have only 0 and 1 but 0, 1, 2, 3. The reality is, I just do...
Camue
1

votes
3

answer
50

Views

How can I solve a classification problem with a dependent variable with more than two values

I have a simple NLP problem, where I have some written reviews that have a simple binary positive or negative judgement. In this case I am able to train and test as independent variables the columns of X that contain the 'bags of words', namely the single words in a sparse matrix. from sklearn.feat...
Drocchio
1

votes
1

answer
126

Views

Machine learning algorithm score changes without any change in data or step

I am new to machine learning and getting started with the Titanic problem on Kaggle. I have written a simple algorithm to predict the result on test data. My question/confusion is that every time I execute the algorithm with the same dataset and the same steps, the score value changes (last statement in th...
YoungHobbit
0

votes
1

answer
13

Views

Text Classification Approach

I have data with 2 important columns, Product Name and Product Category. I wanted to classify a search term into a category. The approach (in Python using Sklearn & DaskML) to create a classifier was: clean the Product Name column of stopwords, numbers, etc.; create a 90%/10% train-test split; convert text...
user519326
1

votes
2

answer
46

Views

How to perform SMOTE with cross validation in sklearn in python

I have a highly imbalanced dataset and would like to perform SMOTE to balance the dataset and perform cross validation to measure the accuracy. However, most of the existing tutorials make use of only a single training and testing iteration to perform SMOTE. Therefore, I would like to know the correct...
Emi
0

votes
0

answer
4

Views

Training the binary image classifier (cats/dogs) with one more class

I am using the following code: https://github.com/llSourcell/how_to_make_an_image_classifier/blob/master/demo.ipynb Where he does a simple implementation of cats vs dogs. I have used my own imageset which consists of just 240 training images and 50 test samples and I have tried this same model. The...
Teja8484
1

votes
0

answer
406

Views

Feature selection for a large dataset in Python

I have a document-term matrix of dimension 3144469 x 268496 for which I need to do feature selection. I tried doing it with scikit-learn's feature selection using the code fs = feature_selection.SelectPercentile(feature_selection.chi2, percentile=40) documenttermmatrix_train= fs.fit_transform(documentter...
Ranjana Girish
1

votes
0

answer
43

Views

Neural network classification with several inputs of different types

I've started to study neural networks and now I'm learning to use them to classify objects. But I have one doubt: how should I represent the input array if the inputs have different types (e.g. number and string)? For example, if I have to classify an apartment (array with the same type (all int)),...
Jacob
1

votes
1

answer
106

Views

Neural network loss is decreasing but the accuracy is not increasing?

I implemented a batch-based back-propagation algorithm for a neural network with one hidden layer and a sigmoid activation function. The output layer is a one-hot sigmoid layer. The net input of the first layer is z1. After applying the sigmoid it becomes a1. Similarly, we have z2 and a2 for the second layer. The back-pr...
MohsenIT
1

votes
1

answer
713

Views

Multiclass classification of text in R

I have built a random forest for multiclass text classification. The model returned an accuracy of 75%. There are 6 labels; however, out of the 6 classes, only 3 are classified and the rest are not. I would really appreciate it if anyone could let me know what went wrong. Below are the steps I f...
Karthik Shanmukha
1

votes
1

answer
122

Views

Analyze a data set in WEKA

I'm new to WEKA and would like to ask if anyone can help me understand whether I'm using WEKA correctly. 1) I have a data set of 11377 records classified as follows: 11111 records have class YES, 266 records have class NO (for some reason, I can use only the J48 algorithm for classification). When I sele...
IvanC
1

votes
0

answer
72

Views

How does TF-IDF handle missing values?

I am working on a classification problem in which I have to classify the product category based on product information such as title, description and other attributes. It works for distinct categories but is biased for closely related categories like mobile and mobile accessories. Let's say I h...
Sam
1

votes
0

answer
76

Views

The implementation of lazy multi-label classifiers in Mulan

I want to use k nearest neighbor for multi-label classification. There are some classifiers based on kNN which are implemented in the Mulan library, or are written in C or Matlab, such as MLKNN. When I use the same classifier for a numeric dataset I get identical results, but for a nominal dataset such as sla...
niloofar
1

votes
0

answer
514

Views

Is passing sklearn tfidf matrix to train MultinomialNB model proper?

I'm doing some text classification tasks. What I have observed is that when fed a tfidf matrix (from sklearn's TfidfVectorizer), the Logistic Regression model always outperforms the MultinomialNB model. Below is my code for training both: X = df_new['text_content'] y = df_new['label'] X_train, X_test, y_train,...
ZEE
1

votes
1

answer
818

Views

How do you do ROI-Pooling on Areas smaller than the target size?

I am currently trying to get the Faster R-CNN network from here to work on Windows with TensorFlow. For that, I wanted to re-implement the ROI-Pooling layer, since it is not working on Windows (at least not for me. If you have any tips on porting to Windows with TensorFlow, I would highly appreciate...
Martin
1

votes
0

answer
552

Views

Text categorization using the mlr package in R

I need to train a model which would perform multilabel multiclass categorization on text data. Currently, I'm using the mlr package in R. Unfortunately, I could not proceed further because of an error I got before training the model. More specifically, I'm stuck at this point: classify.task = makeMultilabe...
Ahsan Nawaz
1

votes
2

answer
1.1k

Views

Keras Dense Net Overfitting

I am attempting to use Keras to build an activity classifier from accelerometer signals. However, I am experiencing extreme overfitting of the data even with the simplest of models. The input data is of shape (10,3) and contains roughly 0.1 second of data from the accelerometer in 3 dimension...
ihunter2839
1

votes
0

answer
104

Views

Python classification technique: Naive Bayes

I am doing research on classification techniques. I found code online for Naive Bayes classification in Python, which I have shared below, but I am getting errors in it. Please help me solve the errors. The software I am using is Anaconda with Python 3.6. The code is as follows: impo...
1

votes
0

answer
74

Views

Suggestions for block sizes in dask for my matrix (doing onehot and classification)

I am new to using dask, although I have experience in parallel computing and other libraries. I was wondering if someone had good suggestions about which block sizes I should use. I have done the following workflow previously in memory using scikit-learn with a smaller matrix. I would like to...
SWZ
1

votes
0

answer
247

Views

tensorflow text classification using softmax

I'm new to both tensorflow and machine learning and I'm playing with the Enron dataset to classify the top 10 senders. I found some nice examples on Kaggle that use scikit-learn and work, but when I try the same with tensorflow the accuracy is embarrassingly bad. Below is what I'm doing: Load t...
as_ibis
1

votes
1

answer
432

Views

How do I properly use the weight_column when working with tf.estimator.DNNClassifier in Tensorflow (or how do I make a biased cost function)?

I am using https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier Let's say I have a Classification problem. Attempting to classify 2 things. Class1 is Happy Face, Class2 is Not Happy Face. In this particular scenario, when looking at 1,000+ samples every day, I just want to grab the...
mschmidt42
1

votes
0

answer
26

Views

How to select features when you have images (pixels) with extra information (categories)?

Suppose you need to train your classifier on a dataset that has images as well as more descriptor features available (along with the labels, of course). For example, if you have to classify cats vs dogs, and you are provided with the image, weight and age of each animal. If I just had the image, I could...
Dhruv Batheja
1

votes
0

answer
132

Views

Sklearn MultinomialNB gives a probability of 1 for some class for a few examples?

I used MultinomialNB from sklearn for some text data. The data contains 12 classes, and it is a classification task. After applying MultinomialNB with CountVectorizer, I checked the predicted class probabilities of a few examples, and for some reason one class shows a probability of 1.0. [[ 3.91049692e-23 , 2.50074669e-...
Poojan
1

votes
0

answer
41

Views

How to choose training data from satellite imagery for supervised classification?

I am performing supervised classification of Sentinel-2 imagery using a Random Forest classifier. I wish to select the training data from the image. Could anyone please tell me how to do this efficiently?
Shivam Pande
1

votes
0

answer
400

Views

Using weighted_cross_entropy_with_logits for multilabel sparse classification

I have a multilabel classification problem where each tuple in the training data set is labeled with one or more classes and the number of classes in the data set is large (~500), resulting in sparse target vectors such as [1, 0, 0, ..., 1, 0, 0, ...]. I am using Keras with the Tensorflow backend to build the cl...
Reham M Samir
1

votes
0

answer
39

Views

How to deploy scikit-learn classifier model into ANN written in another language

I have a scikit-learn classifier: MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08, hidden_layer_sizes=(13, 13, 13), learning_rate='constant', learning_rate_init=0.001, max_iter=500, momentum=0.9, nesterovs_momentum=True,...
Robert Boyer
1

votes
1

answer
211

Views

Using fitctree to train a more sensitive model with an imbalanced training set

I'm trying to build a decision tree in MATLAB for binary classification. I have 4 features for each instance. There are around 25,000 instances in the positive class and 350,000 instances in the negative class. I've tried building classifiers both within the classification learner app and using fit...
user3470496
1

votes
0

answer
237

Views

Do scikit-learn classifiers automatically one-hot encode?

I'm confused by the behavior of the fit methods for the scikit-learn classifiers. I'm preprocessing my array that identifies the classes such that they are one-hot encoded, e.g., the shape is (n_samples, n_classes). However, when I try to use algorithms like SVC or logistic regression, I get the foll...
jonabuck
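
For the shape mismatch this question describes, one possible workaround is to collapse the one-hot rows back into a one-dimensional list of class indices before fitting. This is a minimal sketch that assumes the classifier expects labels of shape (n_samples,); the helper `one_hot_to_labels` is hypothetical and the error message in the excerpt is truncated, so whether this addresses it is a guess.

```python
def one_hot_to_labels(one_hot):
    """Convert rows like [0, 1, 0] into class indices like 1."""
    labels = []
    for row in one_hot:
        # Assumes exactly one entry per row is the maximum (a valid one-hot row).
        labels.append(row.index(max(row)))
    return labels


# Hypothetical targets with shape (n_samples, n_classes) = (4, 3).
y_one_hot = [
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
]

y_labels = one_hot_to_labels(y_one_hot)  # shape (n_samples,)
print(y_labels)
```

The resulting flat list carries the same information as the one-hot array and can be passed as the target in place of the two-dimensional version.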
1

votes
0

answer
82

Views

Python: Pipeline use some result of first classifier to second classifier (sklearn)

I want to use GaussianNB to classify into category A / B, then use MultinomialNB to classify only type A into sub-categories a1 / a2 / a3. My question is how I can insert another first classifier into the pipeline and use only the A results as input to the second classifier. What I have now: pipeline1 = Pipeline...
patppd
1

votes
0

answer
125

Views

R - Document-context matrix from dtm-tf and word embeddings

I have a term-frequency, document-term matrix (dtm-tf) in which each row is a document, each column is a term, and each number in the matrix represents the number of occurrences of the term in the document. I also have a term-context matrix (a matrix of word vectors/embeddings) where each row is a te...
Christopher Costello
1

votes
0

answer
109

Views

Document representation with pre-trained Word Vectors for Author Classification/Regression (GP)

I am trying to replicate (https://arxiv.org/abs/1704.05513) to do a Big 5 author classification on Facebook data (posts and Big 5 profiles are given). After removing the stop words, I embed each word in the file with their pre-trained GloVe word vectors. However, computing the average or coordinate-...
qwazy
1

votes
0

answer
115

Views

Extracting information from a sentence: NER or other ways?

What I'm trying to do now is extract customers' names from a firm's disclosure text. What I have done up to now is as follows: Classify every sentence in the disclosure data by machine learning according to whether it contains information about the firm's customers (1 if it contains customer data, 0 if not) So...
ChanKim
1

votes
0

answer
443

Views

Custom loss function Keras backend

I'm stuck implementing a custom loss function that should measure the recall of the classified data. For a more detailed problem description, see: Classification: skewed data within a class. I have implemented it with numpy arrays, but how would one translate this to the Keras backend? Does an...
BugridWisli
1

votes
0

answer
31

Views

How can I use libsvm for multiclass pixel-based classification in MATLAB?

I'm working with libsvm and I must perform a multiclass pixel-based classification. I want to classify an image which contains four classes. For training, I have extracted SURF dense features for each class and put them in Data_Train.xlsx, in which the first column is the class and the rest is the SUR...
PhD Ma
1

votes
0

answer
335

Views

tf-slim resnet pretrained model can't get correct results

I am using the pretrained resnet50 model provided by tensorflow slim. When I use this model for inference, I can't get correct results. Can anyone help me solve this problem? The following is the code I used for inference. The image preprocessing method follows this issue ResNet pre-processing: VGG...
auroua
1

votes
1

answer
382

Views

C# Accord.NET text classification

I have an unknown number of columns in my TFIDF vector. My classification code is: double[][] inputs = table.ToJagged('ColumnName1','columnName2'); int[] outputs = table.Columns[2].ToArray(); var teacher = new NaiveBayesLearning(); var nb = teacher.Learn(inputs, outputs); I don't know how to pass unknown...
Heights
1

votes
0

answer
257

Views

Pipeline : add another feature to text classification in Python (FeatureUnion)

I am attempting to implement a text classification solution using scikit-learn. I have been able to get results for simple classification of text. Now I want to add another (non-text) feature into the prediction process to improve accuracy. My dataset is as follows: label: the target value i.e,...
borb183
