Questions tagged [computer-vision]

1

votes
1

answer
93

Views

OpenCV: Merging fitted shapes

If I'm using OpenCV (Python) and fit two shapes, like so: a = cv2.fitEllipse(contours) b = cv2.minAreaRect(contours) Both a and b are represented as Box2D objects, which look something like: center: (x, y) size: (width, height) rotation: angle a and b are often going to be fairly similar, but not ex...
Jordan
0

votes
0

answer
6

Views

I need some suggestion to move forward with my image recognition task

I am currently working on an image recognition problem where I would like to recognize images with the highest probability, meaning the expectation is to match an image having a maximum percentage of match score from the pool of images given input test images. I want any ideas, suggestion or any blo...
user11409134
0

votes
0

answer
2

Views

Lane detection with brightness change and shades on lanes?

I am currently working on a lane detection project, where the input is an RGB road image 'img' from a racing game, and the output is the same image annotated with drawn colored lines on detected lanes. The steps are: Convert the RGB image 'img' to HSL image, then use a white color mask on it (white...
Mohamed EL Tair
0

votes
0

answer
4

Views

Should YOLOv3 annotations be done before the resize?

I am about to start annotating my images to train a YOLOv3 model. Before starting I want to make sure that it is okay to create the annotations on the original image. Would the annotations change respectively after I resize my images before training? Or should I resize all of my images first then st...
sugar.darre
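For the resize question above, a minimal sketch of why YOLO-style labels survive a plain resize. It assumes the usual normalized label format (center x, center y, width, height, each divided by the image size) and a resize with no cropping or padding. The helper names and the sample box are hypothetical.

```python
import math

def to_yolo(box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to normalized YOLO form."""
    x_min, y_min, x_max, y_max = box
    return (
        (x_min + x_max) / 2 / img_w,   # normalized center x
        (y_min + y_max) / 2 / img_h,   # normalized center y
        (x_max - x_min) / img_w,       # normalized width
        (y_max - y_min) / img_h,       # normalized height
    )

def resize_box(box, sx, sy):
    """Scale a pixel box by the same factors used to resize the image."""
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)

# Hypothetical annotation made on the original 640x480 image
orig_w, orig_h = 640, 480
box = (100, 50, 300, 250)

# The same image stretched to 416x416 (no letterboxing, no crop)
new_w, new_h = 416, 416
scaled_box = resize_box(box, new_w / orig_w, new_h / orig_h)

label_before = to_yolo(box, orig_w, orig_h)
label_after = to_yolo(scaled_box, new_w, new_h)
labels_match = all(math.isclose(a, b) for a, b in zip(label_before, label_after))
```

Under these assumptions the normalized label is the same before and after resizing. If the pipeline pads or crops the image instead of stretching it, the labels have to be transformed along with it.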
1

votes
2

answer
3.9k

Views

How to change invert frames in EmguCV?

Currently, I am working on a real-time IRIS detection application. I want to perform an invert operation to the frames taken from the web camera, like this: I managed to get this line of code, but this is not giving the above results. Maybe parameters need to be changed, but I am not sure. CvInvoke....
gouthaman93
1

votes
1

answer
380

Views

Canny Edge vs Thresholding for contour estimation in Open CV

I am using Open CV for an image processing application that involves contour estimation in images. What I would like to know is whether Thresholding the image (like how they have done here) or using Canny Edge Algorithm (here) yields a better result. Does this involve algorithmic analysis or am I m...
rhino2rhonda
1

votes
2

answer
1.3k

Views

how does SLAM extract landmarks?

In the 'slam for dummies' tutorial, laser scanner was used, and two methods of landmark extraction were shown. But most practical SLAM implementations are based on camera images. In these applications how are landmarks extracted? The Durrant-Whyte paper does not mention it and I could not find an en...
teddy teddy
0

votes
0

answer
4

Views

Why is the model not learning with pretrained vgg16 in keras?

I am using the pre-trained VGG 16 model available with Keras and applying it to the SVHN dataset, which has 10 classes for the digits 0-9. The network is not learning and has been stuck at 0.17 accuracy. There is something that I am doing incorrectly but I am unable to recognise it. The way...
Amanda
-1

votes
0

answer
5

Views

What type of machine learning or AI Model can I use for Factor Ranking

What type of machine learning or AI model can I use for factor ranking? I have some factors and am trying to rank them by how well they predict in my model. What kind of machine learning, AI, or deep learning model would work for this?
tplshams
1

votes
1

answer
740

Views

How to train a FCN network while the size of images are not fixed and they are varying?

I have already trained the FCN model with fixed-size 256x256 images. How can I train the same model when the image size varies from one image to another? I would really appreciate your advice. Thanks
S.EB
1

votes
1

answer
47

Views

How does np.outer help in creating a filter kernel?

I was trying the filter2D function with OpenCV using my own kernel: kernel = np.array([1,3,4,5,2]) / 11 cv2.filter2D(img, -1, kernel) and it works fine. I also saw a snippet where the same thing was done as follows: kernel = np.array([1,3,4,5,2]) / 11 kernel = np.outer(kernel, kernel) cv2.filter2D(i...
Sophia
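For the np.outer question above, a minimal sketch of what the outer product does to the 1-D kernel from the excerpt. It uses plain NumPy instead of cv2.filter2D, a hypothetical random test image, and a hand-written `correlate_valid` helper with no border handling, so it illustrates the idea and does not reproduce OpenCV's exact output.

```python
import numpy as np

k1d = np.array([1, 3, 4, 5, 2]) / 11   # 1-D kernel from the question, shape (5,)
k2d = np.outer(k1d, k1d)               # shape (5, 5); k2d[i, j] == k1d[i] * k1d[j]

def correlate_valid(image, kernel):
    """Plain cross-correlation over positions where the kernel fits entirely."""
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical test image
rng = np.random.default_rng(0)
img = rng.random((8, 8))

# One pass with the 5x5 outer-product kernel ...
full_2d = correlate_valid(img, k2d)

# ... versus a horizontal 1x5 pass followed by a vertical 5x1 pass
rows_then_cols = correlate_valid(correlate_valid(img, k1d[None, :]), k1d[:, None])

same = np.allclose(full_2d, rows_then_cols)
```

The outer product turns the 1-D weights into a separable 2-D kernel, so filtering with it is equivalent to applying the 1-D weights along both axes, whereas a single 1-D kernel only filters along one axis.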
1

votes
1

answer
33

Views

Using gaussian blur with zero size kernel?

I was reading about Gaussian Blur when one of the examples I came across was as follows: cv2.GaussianBlur(img, (0,0), 5) What does 0,0 mean here? This is in sharp contrast to another example I read: cv2.GaussianBlur(img,(5,5),0) How are both different from each other?
Harris Pat
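For the Gaussian blur question above, a minimal sketch of how the missing parameter is derived in each call. The sigma-from-ksize formula is the one given in the OpenCV documentation for getGaussianKernel. The ksize-from-sigma rule is an assumption based on the OpenCV implementation for 8-bit images and may differ between versions. The function names are hypothetical.

```python
def sigma_from_ksize(ksize: int) -> float:
    """Sigma used when sigma is passed as 0, e.g. GaussianBlur(img, (5, 5), 0)."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

def ksize_from_sigma(sigma: float, is_8bit: bool = True) -> int:
    """Kernel size assumed when ksize is (0, 0), e.g. GaussianBlur(img, (0, 0), 5).

    Assumption: the kernel spans about 3 sigma per side for 8-bit images
    (4 sigma otherwise) and is forced to an odd size.
    """
    factor = 3 if is_8bit else 4
    return int(round(sigma * factor * 2 + 1)) | 1

auto_sigma = sigma_from_ksize(5)     # sigma implied by a 5x5 kernel
auto_ksize = ksize_from_sigma(5.0)   # kernel size implied by sigma = 5
```

In short, one call fixes the kernel size and lets the library choose sigma, while the other fixes sigma and lets the library choose the kernel size, so the second blurs far more strongly.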
1

votes
1

answer
70

Views

What does the TM_CCORR and TM_CCOEFF in opencv mean?

What does the TM_CCORR and TM_CCOEFF in opencv mean? I found that TM_CCORR stands for the correlation coefficient. However, TM_CCOEFF also seems to be the correlation coefficient, judging by its name. Do you know what the abbreviations stand for? TM_SQDIFF = Template Matching Square Difference TM_CC...
Rene B.
1

votes
1

answer
263

Views

Binary Segmentation of a complex image in python. [closed]

I was trying to implement binary segmentation for my project work, but I got stuck with the binary segmentation code. I just want to get a continuous segmentation of a player and background in the picture like this image (A proper segmentation of a hand). import numpy as np impo...
kanik garg
1

votes
0

answer
640

Views

Applying a specific High pass filter on a RGB image in numpy

I'm trying to preprocess my image before feeding it to the CNN. Goal To extract the residual after applying a high pass filter ( Reference 1 ) on an RGB image of dimensions 512x512 ( basically a shape of (512,512, 3) ) using the following equation: link to image where I is the Image and the matrix is...
Pradyumna Rahul
1

votes
0

answer
72

Views

Is there a way to cancel a Vision Framework request (VNDetectTextRectanglesRequest) once triggered?

I'm running a Vision Framework request for an iOS app like follows: let textDetectionRequest = VNDetectTextRectanglesRequest(completionHandler: self.findTextBoxes) let textDetectionHandler = VNImageRequestHandler(cgImage: image, orientation: orientation, options: [:]) DispatchQueue.global(qos: .back...
Andre Guerra
1

votes
0

answer
85

Views

Custom Imagenet Dataset

I want to make my own custom dataset of imagenet taking 20 classes out of 1000. I am able to convert it to TFRecords format but I want a keras friendly version for it as keras is more flexible for me.
Ayush Agarwal
1

votes
0

answer
331

Views

Why does Keras model.predict() result in different probabilities based on the size of testing data?

I'm relatively new to Keras and image classification in general and I'm running into an issue that I can't seem to find much information on. So the gist of it is that I've written a slightly modified version of the resnet50 architecture and am testing it on my own training dataset of 5000 images. T...
csblue09
1

votes
0

answer
339

Views

Epipolar Geometry, Not Visually sane output in OpenCV

I've tried using the code given https://docs.opencv.org/3.2.0/da/de9/tutorial_py_epipolar_geometry.html to find the epipolar lines, but instead of getting the output given in the link, I am getting the following output. but when changing the line F, mask = cv2.findFundamentalMat(pts1,pts2,cv2.FM_LME...
K.H
1

votes
0

answer
41

Views

Image processing : Extract background class from annotated images

Given an image of dimensions aXb (see an example below), and a few rectangular annotations for different classes, what is the most effective (or non-brute-force) method for finding non-overlapping rectangular annotations (at least m X n dimension) not covered by the already provided annotations?...
GKS
1

votes
0

answer
666

Views

detect orientation of sub image

I have a scanned form, but sometimes the scanned forms are skewed. I want to automatically detect the orientation of the scanned PDF and rotate it so that the text in it is exactly horizontal, with minimal error. My idea is as follows: the form has a few printed logos. So, what I thoug...
InAFlash
1

votes
0

answer
402

Views

defining a siamese network in tensorflow

I know this question has been asked before, however I've a specific question which has not been answered before. I am trying to define a Siamese network in Tensorflow as follows: def conv(self, x, num_out_maps, ksize, stride, activation_fn=tf.nn.relu): padding_length = np.floor((ksize-1)/2).astype(n...
kunal18
1

votes
1

answer
70

Views

labels should start from 0 or 1 in caffe if use softmaxwithLoss as loss layer?

Two questions: 1. For MultinomialLogisticLoss, the label should start from 1, but the caffe doc says it should start from 0. http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1SoftmaxWithLossLayer.html 2. The shape of score(bottom[0]) is N*C*H*W, and N*1*1*1 for label(bottom[1]) for MultinomialLogist...
spider
1

votes
0

answer
59

Views

My inverse compositional homography image align cannot converge?

I implemented the algorithm based on homographyTrackingDemo and LK-20-years. It's an adaptation of the inverse compositional LK algorithm for estimating affine transformation. There are several steps I have tried: Derive the derivative of template image; Derive the derivative of homography transformation on t...
zskalibur
1

votes
1

answer
1.1k

Views

Apple Vision framework and body recognition

I have questions related to body shape recognition/measurement and the iOS frameworks: Would the Vision framework combined with Core ML be able to recognize a body shape from a still image? If yes, would the Vision framework have functions to process the image and establish precise measurements of t...
troyer
1

votes
1

answer
180

Views

OpenCV Detecting Multiple, Rotated, Scaled objects

I've only been using OpenCV for 12 hours or so and haven't been able to solve this problem. The end goal is to take an image and store each character as an entry inside of 6 separate vector2 arrays (5 chars + bubbles) Additionally, I need to know whether a character is 'enlarged' or not. Link to res...
Travis Foster
1

votes
1

answer
523

Views

Adaptive Canny Edge Detection Algorithm

I am trying to implement Canny Algorithm using python from scratch. I am following the steps Bilateral Filtering the image Gradient calculation using First Derivative of Gaussian oriented in 4 different directions def deroGauss(w=5,s=1,angle=0): wlim = (w-1)/2 y,x = np.meshgrid(np.arange(-wlim,wlim...
Siladittya
1

votes
0

answer
94

Views

Graph cut based on visibility

According to the following figure, a surface mesh can be extracted using an Energy function = L, where L is number of intersection of a triangle. Graph Cut algorithm I would like to know how would I construct the graph and apply cut to it using C++. According to that library http://pub.ist.ac.at/~vn...
andre ahmed
1

votes
0

answer
296

Views

Input size mismatch error when using pre-trained inceptionV3 model for image classification

I'm facing trouble when training a model using pre-trained inceptionV3 for my own image data set. I'm loading images using data.Dataset loader and 'transforms' for image transformation. Here's my inceptionV3 model inceptionV3 = torchvision.models.inception_v3(pretrained=True) pretrained_model = nn.S...
Sam
1

votes
1

answer
266

Views

Handling Results from Computer Vision API

I am playing around with Azure Cognitive Services Computer Vision API and I am running into issues knowing what to do with the results. The use case is I have an image that is a photo of a calendar of events for a particular month. I am running the image through the Computer Vision API OCR Method ht...
Isaac Levin
1

votes
1

answer
1.3k

Views

Understanding and tracking of metrics in object detection

I have some questions about metrics if I do some training or evaluation on my own dataset. I am still new to this topic and just experimented with tensorflow and googles object detection api and tensorboard... So I did all this stuff to get things up and running with the object detection api and tra...
user8574993
1

votes
1

answer
504

Views

2D Image to 3D world Coordinates

We crawled a set of images from the Google Street View (GSV) API. I want to estimate the 3D World coordinate from 2D Image given the following: 1. The GPS location (i.e., latitude and longitude) of the camera capturing the image Conversion of GPS coordinates to translation matrix: Used 2 types of co...
fararjeh
1

votes
1

answer
98

Views

Problems downloading something-something datasets

I am now learning video recognition and want to use the dataset something-something published by twentybn(https://www.twentybn.com/datasets/something-something). I followed the downloading and installing tips in https://github.com/TwentyBN/twentybn-dl, which told me to download the dataset in the vi...
Hanzp
1

votes
1

answer
557

Views

How to oversample image dataset using Python?

I am working on a multiclass classification problem with an unbalanced dataset of images (different classes). I tried the imblearn library, but it is not working on the image dataset. I have a dataset of images belonging to 3 classes, namely A, B, C. A has 1000 images, B has 300 and C has 100. I want to oversampl...
ReInvent_IO
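For the oversampling question above, a minimal sketch of random oversampling done on sample indices instead of pixel data. It assumes the class counts from the excerpt (A: 1000, B: 300, C: 100). The label array and variable names are hypothetical, and this is plain duplication with replacement, not the imblearn API.

```python
import numpy as np

# Hypothetical label array matching the class counts in the question
labels = np.array(["A"] * 1000 + ["B"] * 300 + ["C"] * 100)

rng = np.random.default_rng(42)
classes, counts = np.unique(labels, return_counts=True)
target = counts.max()   # bring every class up to the size of the largest one

balanced_idx = []
for cls, count in zip(classes, counts):
    cls_idx = np.flatnonzero(labels == cls)
    # Keep every original sample, then draw the shortfall with replacement
    extra = rng.choice(cls_idx, size=target - count, replace=True)
    balanced_idx.append(np.concatenate([cls_idx, extra]))

balanced_idx = np.concatenate(balanced_idx)
rng.shuffle(balanced_idx)

# Per-class counts after oversampling
_, balanced_counts = np.unique(labels[balanced_idx], return_counts=True)
```

The resulting `balanced_idx` can index a list of image paths or an image array, so duplicated samples are loaded (and optionally augmented) at training time without copying files.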
1

votes
0

answer
231

Views

How to understand RPN phrase in Faster-RCNN

Sorry, this is one question made up of many small questions that I can't split into separate questions. For example, input picture size 960x640 Through VGG16 layer 13 Conv5_3, get feature_map 60x40x512 Do 3x3 convolution. 3.1 How 3x3 convolution compress the output above to 1x512...
Mithril
1

votes
0

answer
61

Views

Run MatConvNet on CPU with maximum number of workers

I wish to run MatConvNet on CPU (no GPU at all), with 44 workers in parallel. Which part of the code should be modified? Any help is highly appreciated.
Faranak
1

votes
0

answer
95

Views

Effect of adding/removing layers in CNN

I wanted to know about how adding layers or removing layers in a convolutional neural network affects the results produced. My problem specifically deals with detection of lips, something like this https://img-s3.onedio.com/id-57dc3cd17669c0cf0e4e2705/rev-0/raw/s-4909b77cbbb875d94fea9406a207936f767d...
Sarthak Agarwal
1

votes
1

answer
147

Views

How does TensorFlow train kernels?

TensorFlow's API describes the function tf.nn.conv2d() which takes in an argument of filter size: [filter_height, filter_width, in_channel, out_channel]. So if I used the mnist dataset and ran the network on an image displaying the number '5,' would the filter be trained on the lower, circular bowl...
QuarterShotofEspresso
1

votes
1

answer
284

Views

Python Scipy ndimage.gaussian_filter throwing runtime error

What I am trying to do: Read an image in python. Apply Gaussian filter using Scipy's ndimage.gaussian_filter() function. Display the resultant image. Here is the code that I am trying to run: import cv2 from matplotlib import pyplot as plt import scipy.ndimage as ndimage img = cv2.imread('lena.png',...
Trevor Track
1

votes
0

answer
206

Views

Angular loss in tensorflow doesn't decrease

Recently I am implementing the 'FC4: Fully Convolutional Color Constancy with Confidence-weighted Pooling' paper in tensorflow. In the paper, angular loss is defined as 'angular loss = arccos(cosine_distance(vector1, vector2))'. However, I tried several ways to implement this loss function but in tr...
Secret_Wang
