# Questions tagged [sample-size]

11 questions

0

28

Views

### MANOVA with variables from different datasets

This question was already asked on stats.stackexchange, but no one answered. Since I'm not sure which forum is the appropriate one, I am posting it here again with some data. I have run experiments on various characteristics of tree bark and now want to compare to what extent the five examined tree species...
bamphe

1

135

Views

### samplesize package in R, understanding the parameters

Small Disclaimer: I considered posting this on cross-validated, but I feel that this is more related to a software implementation. The question can be migrated if you disagree. I am trying out the package samplesize. I am trying to decipher what the k parameter for the function n.ttest is. The follo...
Gumeo

1

0

Views

### SMOTE in R reducing sample size significantly

I have a data set with around 130000 records. The records are divided into two classes of the target variable, 0 and 1. Class 1 accounts for only 0.09% of the records. I'm running my analysis in R-3.5.1 on Windows 10. I used the SMOTE algorithm to work with this imbalanced data set. I used the following code to handle imbalanc...
Sonia

1

1.9k

Views

### Is there a good way to display sample size on grouped boxplots using Python Matplotlib

I could get the size info using groupby and add text to the corresponding location. But I can't help thinking there's a better way as this really seems mundane, something many people would like to see... To illustrate, the following code would generate a grouped boxplot import pandas as pd df = pd.D...
Tian He

1

395

Views

### pwr.chisq.test error in R

I am trying to estimate the sample size needed for A/B testing of a website's conversion rate. pwr.chisq.test always gives me an error message when the conversion rate is small: # conversion rate for two groups p1 = 0.001 p2 = 0.0011 # degree of freedom df = 1 # effect size w = ES.w1(p1,p2) p...
Peter Pan
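
For the conversion rates quoted in the excerpt above (p1 = 0.001, p2 = 0.0011), a rough per-group sample size can be sketched with the textbook normal-approximation formula for comparing two proportions. This is not what `pwr.chisq.test` computes internally, and it does not diagnose the error in the question. The helper name `n_per_group`, the 5% two-sided significance level, and the 80% power are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided two-proportion z-test.

    Hypothetical helper: uses the unpooled normal approximation,
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


n_required = n_per_group(0.001, 0.0011)
print(n_required)  # on the order of 1.6 million per group
```

With rates this small and this close together, the required sample is very large, which is one reason numerical solvers with a bounded search range can struggle on such inputs.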

1

64

Views

### Simulating thousands of regressions and obtaining p-values

I'm looking to do some basic simulation in R to examine the nature of p-values. My goal is to see whether large sample sizes trend towards small p-values. My thought is to generate random vectors of 1,000,000 data points, regress them on each other, and then plot the distribution of p-values and loo...
macworthy
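
The simulation idea in the excerpt above can be sketched as follows, with several loud assumptions: the sketch is in Python rather than R, it uses far fewer than 1,000,000 points per regression so it runs quickly, the two variables are independent standard normals, and the slope's p-value uses a normal approximation instead of the exact t distribution. The function names `slope_p_value` and `simulate` are hypothetical. When the variables are truly unrelated, the share of p-values below 0.05 should stay near 5% regardless of sample size.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def slope_p_value(x, y):
    """Two-sided p-value for the OLS slope of y on x (normal approximation)."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares gives the slope's standard error.
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)
    z = slope / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_obs, n_sims, seed=0):
    """Regress independent noise on independent noise n_sims times."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values.append(slope_p_value(x, y))
    return p_values


for n_obs in (50, 500):
    p = simulate(n_obs, n_sims=400)
    share = sum(v < 0.05 for v in p) / len(p)
    print(n_obs, round(share, 3))
```

Large samples push p-values toward zero only when a real effect exists, however small; introducing a nonzero slope into `y` would show that behaviour.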

1

5.8k

Views

### Sample size and power calculation in R as viable alternative to proc power in SAS?

I am trying to see how close the sample size calculations (for two-sample independent proportions with unequal sample sizes) are between proc power in SAS and some sample size functions in R. I am using the data found here at a UCLA website. The UCLA site gives parameters as follows: p1=.3,p2=...
user27008

1

507

Views

### Optimizing for global minimum

I am attempting to use optimize() to find the minimum value of n for the following function (Clopper-Pearson lower bound): f
a.powell

1

182

Views

### Stratified Bootstrapping in R with >25 strata

I have data with about 25 different groups. To see how the variance of each group would change with different sample sizes, I am trying to do stratified bootstrapping. For example, at sample size 5, it should produce 1000 collections of 5 resampled points for each group. I like to coll...
andemexoax
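
The resampling scheme described in the excerpt above (for each group, 1000 resamples of a fixed size drawn with replacement from that group only) can be sketched like this. It is a Python illustration rather than R, the function name `stratified_bootstrap_variances` is hypothetical, and the toy data uses three made-up groups in place of the roughly 25 in the question.

```python
import random
from statistics import variance


def stratified_bootstrap_variances(groups, sample_size, n_reps=1000, seed=0):
    """Return {group: [variance of each resample]}, resampling within groups."""
    rng = random.Random(seed)
    result = {}
    for name, values in groups.items():
        # Draw with replacement from this group only, n_reps times.
        result[name] = [
            variance(rng.choices(values, k=sample_size))
            for _ in range(n_reps)
        ]
    return result


# Hypothetical toy data standing in for the groups in the question.
data_rng = random.Random(1)
groups = {
    f"g{i}": [data_rng.gauss(0, 1 + i) for _ in range(30)] for i in range(3)
}
boot = stratified_bootstrap_variances(groups, sample_size=5)
print({name: len(vals) for name, vals in boot.items()})
# {'g0': 1000, 'g1': 1000, 'g2': 1000}
```

Looping over groups keeps each stratum's resamples separate, so the number of strata does not change the logic; repeating the call with other `sample_size` values gives the variance distributions to compare.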

1

8.4k

Views

### Minimum number of observations when performing Random Forest

Is it possible to apply RandomForests to very small datasets? I have a dataset with many variables but only 25 observations each. Random forests produce reasonable results with low OOB errors (10-25%). Is there any rule of thumb regarding the minimum number of observations to use? In fact one of the...
Oritteropus

2