Questions tagged [statistics]

1

votes
1

answer
329

Views

Why does variogram always plot 15 points in R?

I would like to plot a variable number of points as my sample size increases. However, for some reason the "variogram" function only plots 15 points every time. I checked to make sure that the size of the data I'm passing "variogram" was varying correctly - it was. library(gstat) library(RandomFiel...
Candic3
1

votes
1

answer
25

Views

How can I get a similar distribution from different groups?

I have to find subgroups in the dataset whose averages for 2 metrics are similar to those of my original group. For example, I'd like to find a city or group of cities with the closest average(metric 1) = 10 and average(metric 2) = 5. Dataset example: How can I do it?
gabriel.almeida
-1

votes
0

answer
15

Views

How to fix this function?

I am doing an exercise to create a function. One of the questions is: "We can estimate the cumulative risk of a certain event using the exponential formula 1-exp(-1/10000*t) where t is the time to the event. Create a function ans(t), which returns the risk at time t." I am using this command:...
hasan sohail
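
The formula quoted in this exercise can be written as a one-line function. The following is only a minimal sketch in Python; the excerpt does not show which language or command the asker used, so the language and the `rate` default argument are assumptions made for illustration.

```python
import math


def ans(t, rate=1 / 10000):
    """Cumulative risk at time t, using the quoted formula 1 - exp(-rate * t)."""
    # `rate` is a hypothetical parameter name; its default is the 1/10000 from the exercise.
    return 1 - math.exp(-rate * t)


print(ans(0))      # risk at time 0
print(ans(10000))  # risk when rate * t is 1, i.e. 1 - exp(-1)
```

Keeping the rate as a default argument leaves the call `ans(t)` exactly as the exercise asks while still allowing another rate to be tried.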
1

votes
0

answer
14

Views

Why does exponential smoothing return an error for a leap year in Python?

Here is my code: alpha=0.01 beta=0.9 gamma=0.6 trend= "additive" seasonal="additive" period=364 fit= Exponential_method(period,alpha,beta,gamma,trend,seasonal) My data has length 366 because of the leap year, and I am forecasting on daily data. Here period=364 gives an error (Why it is giv...
rohan
-1

votes
0

answer
3

Views

Quantifying and comparing vendor provided data across multiple data vendors

Layman and first time poster here. Apologies and thanks in advance. I'm looking to test the accuracy of vendors I'm looking to hire for data collection. So far my list is narrowed to eight different providers, each delivering two to six columns of data per record using the same standard 100 record...
OpenWindow REI
0

votes
0

answer
5

Views

How to iterate and increase a counter in SPSS?

I want to count educational advancement in my dataset in SPSS. I have some programming experience, but I am stuck with the syntax. I have a variable my_education. I want to iteratively compare my_education with education_father and education_mother. If my_education is bigger than that of my paren...
Radu Chirovici
0

votes
0

answer
27

Views

Plot the difference between two lists of values in matplotlib

I have two datetime based lists. I want to plot the difference between their values. The problem is, the lists are of different lengths / resolutions. For example: list 1 is a list of readings taken every minute throughout the day. list 2 is a list of readings taken randomly throughout the day. I c...
Terence Eden
1

votes
2

answer
257

Views

Calculate pairwise spearman's rank correlation from data present in all files in a directory

I'm trying to calculate Spearman's rank correlation, where the data (tsv with name and rank) for each experiment is stored in separate files in a directory. Following is the format of input files: #header not present #geneName value ENSMUSG00000026179.14 14.5648627685587 ENSMUSG00000026179.14...
Siddharth
1

votes
1

answer
22

Views

Generate a special matrix (max value of column sum is minimum) with a given number of columns from a vector

Recently I came across the following question: given a vector, one needs to generate a special matrix with a given number of columns. It should be pointed out that if the elements in the vector are not enough to fill the generated matrix, then 0 is put in the last row of the generated matrix. For the generated...
Kevin
1

votes
0

answer
13

Views

SMOTE in R reducing sample size significantly

I have a data set with around 130000 records. The records are divided into two classes of the target variable, 0 and 1. Class 1 makes up only 0.09% of the total. I'm running my analysis in R-3.5.1 on Windows 10. I used the SMOTE algorithm to work with this imbalanced data set. I used the following code to handle imbalanc...
Sonia
0

votes
0

answer
5

Views

How Naive Bayes works

I have already read that naive Bayes is a classification algorithm that can make predictions based on the data you give it, but in this example I just can't see how the output [3,4] was produced. Following the example: #assigning predictor and target variables x= np.array([[-3,7],[1,5], [1,2...
Mizlul
1

votes
0

answer
12

Views

Split-normal distribution

What's the best way to compute a split-normal distribution given a mean value with an upper and lower error? So far I have: from random import choice, gauss def random_split_normal(mu: float, upper_sigma: float, lower_sigma:int) -> float: return abs(gauss(0.0, 1.0)) * choice([upper_sigma, -lower_sig...
Alex J. R. Lewis
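
One way to sample a split-normal (two-piece normal) value is sketched below. It is not the asker's code, whose full body is cut off in the excerpt. It assumes that `mu` is the mode where the two halves meet and that the upper side is chosen with probability upper_sigma / (upper_sigma + lower_sigma), the share of probability mass above the mode, rather than with an even 50/50 choice. The `rng` parameter is a hypothetical addition for reproducibility.

```python
import random


def random_split_normal(mu, upper_sigma, lower_sigma, rng=random):
    """Draw one sample from a split-normal distribution with mode mu."""
    # Share of the probability mass that lies above the mode.
    p_upper = upper_sigma / (upper_sigma + lower_sigma)
    # Magnitude of a standard normal draw (a half-normal variate).
    half_normal = abs(rng.gauss(0.0, 1.0))
    if rng.random() < p_upper:
        return mu + half_normal * upper_sigma
    return mu - half_normal * lower_sigma


rng = random.Random(0)
samples = [random_split_normal(10.0, 2.0, 1.0, rng) for _ in range(20000)]
fraction_above = sum(s > 10.0 for s in samples) / len(samples)
print(fraction_above)  # should be close to 2 / (2 + 1)
```

Weighting the side by its sigma keeps the density continuous at the mode; an unweighted coin flip would leave a jump there whenever the two sigmas differ.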
1

votes
2

answer
2.1k

Views

What is a good way to compare similarity between datasets with little variance?

Let's say I have a list of 100 MLB pitchers and 5 statistics for each of them. The difference between, for example, an ERA of 3.5 and 3.1 might not look like a lot to a naive similarity algorithm, but is a lot in baseball. Given that a lot of the player statistics that I'm looking at have this littl...
Carson
0

votes
0

answer
6

Views

How to solve this statistical (standard deviation) problem?

Problem: Your data set has missing values. Further examination tells you that they are spread along 1.5 standard deviation from the median with distribution mean = 0 & variance = 5. How much data would remain unaffected (tell us the %)? Why?
S.Chauhan
1

votes
1

answer
1.6k

Views

R: Multiple Linear Regression with a specific range of variables [duplicate]

This question already has an answer here: short formula call for many variables when building a model [duplicate] 2 answers It appears simple, but I don't know how to code it in R. I have a dataframe (df) with ~100 variables, and I would like to do a multiple regression between the response which i...
Darwin PC
0

votes
0

answer
8

Views

Can someone confirm if I am running this generalized linear model in R correctly?

I'm a grad student and stats beginner just trying to make sure I'm using the right model and using it correctly. I'm using R version 3.5.0. My data look like this: Example Data I have multiple BCI data points for each nest and 5 treatment groups. I want to know if there are differences in BCI betwe...
Heather Smith
1

votes
2

answer
6.7k

Views

Matlab Plotting Normal Distribution Probability Density Function

I am new to statistics. I have a discriminant function: g(x) = ln p(x| w)+ lnP(w) I know it has a normal distribution. I know the mu and sigma parameters. How can I plot its pdf in Matlab? Here is a conversation: How to draw probability density function in MatLab? however I don't want to...
kamaci
1

votes
1

answer
950

Views

python statsmodels.tsa.stattools.pacf with masked array?

Is there a general trick to using masked arrays (or arrays containing nan's) with the statsmodels routines? For example pacf and acf?
mathtick
1

votes
2

answer
98

Views

How to perform statistical computations in a query?

I have a table which is filled with float values. I need to calculate the number of results grouped by their distribution around the mean value (Gaussian Distribution). Basically, it is calculated like this: SELECT COUNT(*), FloatColumn - AVG(FloatColumn) - STDEV(FloatColumn) FROM Data GROUP BY Fl...
iMan Biglari
1

votes
1

answer
186

Views

Creating a line from the t table using simulation (in R)

How would I go about creating a line from the t-table in R, after running a simulation for a t distribution? In essence, I want to perform the qt function using only values calculated from a random sample from the normal distribution, rather than using the confidence levels as inputs. I have run a s...
ShortMyCDS
1

votes
1

answer
2k

Views

Python SciPy chisquare test returns different p value from Excel and LibreOffice

After reading a recent blog post about an application of the Poisson distribution, I tried reproducing its findings using Python's 'scipy.stats' module, as well as Excel/LibreOffice 'POISSON' and 'CHITEST' functions. For the expected values shown in the article, I simply used: import scipy.stats for...
ttsiodras
1

votes
2

answer
1.7k

Views

Scale parameter in the logit model

While going through the logit model notes, I came across something called "scale parameter" in the likelihood. Can someone please explain what that is and what it is used for? What would happen if it is not used? Also, is it used in the probit model too? Cheers
Kazo
1

votes
2

answer
421

Views

C/C++ How to calculate the streakedness of numerical data sets?

Would anyone know how to use C/C++ to calculate the streakedness of data? The definition of streakedness is how many deviations away from the mean (i.e. running average) a numerical data streak is. Thank you for your help. [EDIT] From our company's chief software architect, here is the requirement for a s...
Frank
1

votes
1

answer
105

Views

Fitting a linear model

I have a data frame that looks like > t Institution Subject Class ML1 ML1SD aPhysics0 A Physics 0 0.8730469 0.3329205 aPhysics1 A Physics 1 0.8471074 0.3598839 aPhysics2 A Physics 2 0.8593750 0.3476343 aPhysics3 A Physics 3 0.8875000...
bountiful
1

votes
2

answer
240

Views

CSS3 browser compatibility through the years

I'm trying to find a study or chart which will show the percentage of support for CSS3 in different browsers and versions. I've been looking for 2 hours, but the only thing I find is support for individual parts of CSS3, not the whole of CSS3. Could you help me with this?
Zbyněk Nedoma
1

votes
3

answer
1.8k

Views

Generating a mixture of binomial distributions

I want to generate a mixture of binomial distributions. I need it because I want to have a normal discrete mixture of gaussian distributions. Is there any scipy library available for it or can you please guide me for the algorithm. I know in general for predefined distributions one can use ppf...
Cupitor
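
Sampling from a mixture is usually done in two steps: pick a component according to the mixture weights, then draw from that component. The sketch below uses only the Python standard library instead of scipy. The function name, its parameters, and the example weights are hypothetical, and the binomial draw is a plain sum of Bernoulli trials, chosen for clarity over speed.

```python
import random


def sample_binomial_mixture(weights, ns, ps, size, seed=None):
    """Draw `size` samples from a mixture of Binomial(ns[k], ps[k]) components."""
    rng = random.Random(seed)
    components = range(len(weights))
    samples = []
    for _ in range(size):
        # Step 1: choose which component generates this sample.
        k = rng.choices(components, weights=weights)[0]
        # Step 2: draw from Binomial(ns[k], ps[k]) as a sum of Bernoulli trials.
        successes = sum(rng.random() < ps[k] for _ in range(ns[k]))
        samples.append(successes)
    return samples


# Hypothetical two-component mixture, for illustration only.
mixture_samples = sample_binomial_mixture(
    weights=[0.5, 0.5], ns=[10, 20], ps=[0.2, 0.8], size=5000, seed=1
)
print(sum(mixture_samples) / len(mixture_samples))  # mixture mean is 0.5*2 + 0.5*16 = 9
```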
1

votes
1

answer
1.4k

Views

Calculate autocorrelation with lag u in R

Hi, I tried calculating the autocorrelation with lag u, u = 1...9. I expect a 9x1 autocorrelation function. However, when I use this code it always gives me a 10x1 autocorrelation function with the first term = 1. I am not sure how to proceed. # initialize a vector to store autocovariance maxlag
Grapy
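
The extra first term described here is the lag-0 autocorrelation, which equals 1 by definition; R's acf() reports it along with the requested lags. The sketch below shows the same effect in Python (not the asker's R code, which is cut off in the excerpt), using the estimator that divides by n. The `series` values are hypothetical.

```python
def autocorrelation(x, maxlag):
    """Sample autocorrelation for lags 0..maxlag (autocovariances divided by n)."""
    n = len(x)
    mean = sum(x) / n
    # Lag-0 autocovariance, i.e. the variance with divisor n.
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = []
    for lag in range(maxlag + 1):
        ck = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n
        acf.append(ck / c0)
    return acf


# Hypothetical data, for illustration only.
series = [2.0, 4.0, 3.0, 7.0, 5.0, 8.0, 6.0, 9.0, 10.0, 12.0, 11.0, 13.0]
acf = autocorrelation(series, maxlag=9)
lags_1_to_9 = acf[1:]  # drop the lag-0 term to keep only u = 1..9
print(len(acf), len(lags_1_to_9))
```

Dropping the first element is enough to get the nine values the question expects.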
1

votes
1

answer
127

Views

Standardize not among columns, but small parts of columns, using R

I have a multilevel structure, and what I need to do is standardize for each individual (which is the higher level unit, each having several separate measures). Consider: ID measure score 1 1 1 5 2 1 2 7 3 1 3 3 4 2 1 10 5 2 2 5 6 2 3...
PascalVKooten
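
Standardizing within each individual means subtracting that individual's own mean and dividing by that individual's own standard deviation. The sketch below uses Python instead of R and only the five complete rows visible in the excerpt (the sixth row is cut off and is left out). The use of the sample standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev

# (ID, measure, score) rows copied from the excerpt; the truncated sixth row is omitted.
rows = [(1, 1, 5), (1, 2, 7), (1, 3, 3), (2, 1, 10), (2, 2, 5)]

# Collect the scores that belong to each individual.
scores_by_id = defaultdict(list)
for id_, _measure, score in rows:
    scores_by_id[id_].append(score)

# Per-individual mean and sample standard deviation.
stats = {id_: (mean(s), stdev(s)) for id_, s in scores_by_id.items()}

# z-score of each row relative to its own individual's statistics.
standardized = [
    (id_, measure, (score - stats[id_][0]) / stats[id_][1])
    for id_, measure, score in rows
]
for row in standardized:
    print(row)
```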
1

votes
1

answer
41

Views

How to execute the version of R which is installed in a local folder?

I unpacked the new version of the R package and inside a folder I gave the commands: ./configure make Now I want to run it. If I give the command: $ R then it runs the older version, and I have no privilege to deal with it. So I want to run the newly installed version. Any help? Perhaps it needs to be exported bu...
stephan
1

votes
3

answer
114

Views

How to determine if a current set of data values represent or relate to previous historic data values?

I am trying to develop a method to identify the browsing pattern of a user on the basis of page requests. In a simple example I have created 8 pages and for each page request from the user to the page I have stored that page's request frequency in the database as you can see below: Now, my hypothesis...
Akina91
1

votes
4

answer
500

Views

iOS library to detect app stats

Is there any iOS library which detects various user stats within the app like time spent on a view, number of times app was activated etc.? Any suggestions will be most welcome. Thanks.
prabal
1

votes
1

answer
626

Views

Running R code on `python` with SyntaxError: keyword can't be an expression error Message

I'm looking to run some R code on python. I already installed the R package robustbase on Ubuntu using apt-get install r-cran-robustbase and the rpy package as well. From the python console I can successfully run from rpy import * and r.library("robustbase") but when I run result = robjects.FloatVector...
mongotop
1

votes
2

answer
245

Views

Finding white pixels on monitor in camera image

I have a camera pointed at a monitor displaying a line of white pixels. I get an array of byte values back from the camera. The area of the camera's view is larger than the space taken up by the monitor. I need to find out where on the camera image the white monitor pixels appear. See the sample im...
Alex Wade
1

votes
1

answer
603

Views

Generating random numbers from various distributions in CUDA

I am playing around with doing MCMC on the GPU, and need implementations for various samplers, written for CUDA. Most of the posts I've seen on StackOverflow relate to uniform, binomial, and normal sampling. Are there any libraries that allow me the simplicity and variety of the d-p-q-r functions i...
1

votes
1

answer
282

Views

PHP Random Number Generation Issue

10,000 Loops Range 0-1 Base Average: 0.5 Base Standard Deviation: 0.288675134595 ======================================= mt_rand() Average: 0.337839939116 Standard Deviation: 0.264176807272 --- hexdec(sha1(*GUID*)) Average: 0.37834 Standard Deviation: 0.284251515902
TSUK
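
The "base" figures quoted above are the theoretical mean and standard deviation of a uniform distribution on [0, 1]: the mean is 1/2 and the standard deviation is sqrt(1/12). The short Python check below (not the asker's PHP, and with an arbitrary seed) shows what a well-behaved generator should produce over 10,000 draws.

```python
import math
import random

# Theoretical values for Uniform(0, 1).
expected_mean = 0.5
expected_sd = math.sqrt(1 / 12)

# Empirical values from 10,000 draws; the seed is arbitrary.
rng = random.Random(42)
draws = [rng.random() for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)
sample_sd = math.sqrt(sum((d - sample_mean) ** 2 for d in draws) / len(draws))

print(expected_sd, sample_mean, sample_sd)
```

Averages as low as the 0.34 and 0.38 reported in the question are far outside what sampling noise alone would explain at this sample size.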
1

votes
1

answer
194

Views

Two lines on a line graph with non proportional values

I am trying to get a googlecharts line graph to show me two lines with a Y axis of date and an X axis of total amount of substance used. It will be a line graph comparing, for example, the total amount of alcohol consumed to the total amount of tobacco consumed each day. The area I'm struggling with...
user2444539
0

votes
0

answer
5

Views

Model selection function

I am trying to create a new function which chooses the best model. First, if the data has no X variable, the function will automatically choose the best ARIMA model using the auto.arima function. Second, if there is one X variable, the function will choose the best from the candidate models. Third if...
Nina M
1

votes
1

answer
5.6k

Views

R: converting non-stationary to stationary

I have a data set that is not stationary, and I'm trying to make it stationary. I tried a log transformation, a BoxCox transformation, and lag (1, 2 and 3) differences. None of these transformations or differences helped. I used the adf test to test stationarity in R. Can anybody tell is there any other method to make it s...
Punith
1

votes
2

answer
1.2k

Views

How do I calculate popularity of content?

I'm developing a web site where the user rates content (1-5 stars). I need to measure the popularity of the content (also referred to as importance/hotness/interest). My first thought was just to add the user ratings for a content: Popularity = SUM(Rating - 2.5) If two users give it 5-stars and on...
Pking
