# Questions tagged [regression]

3095 questions

0

votes

0

answer

18

Views

### How to combine KNeighborsRegressor with RandomForestRegressor for prediction?

I have done two predictions for some data, one with KNeighborsRegressor, another with RandomForestRegressor, and have scored them.
I would now like to use both models combined to make a prediction.
I have found online that you can do this by using VotingClassifier.
Here is the code that I currently...
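
Note that VotingClassifier is meant for classifiers. For two regressors, scikit-learn (0.21 or later) offers VotingRegressor, which averages the individual predictions. A minimal sketch, assuming synthetic data in place of the asker's; the data, estimator names, and hyperparameters below are made up for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption: the real data is a numeric feature matrix).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# VotingRegressor fits a clone of each estimator and averages their predictions.
ensemble = VotingRegressor(
    estimators=[
        ("knn", KNeighborsRegressor(n_neighbors=5)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ]
)
ensemble.fit(X, y)

combined_pred = ensemble.predict(X[:5])
print(combined_pred)
```

The fitted sub-models are available as `ensemble.estimators_`, so the combined prediction can be compared against each model on its own.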

0

votes

0

answer

2

Views

### How to test the accuracy of COX-PH models in R with ROC curve

I'm working with the lung dataset in the survival package in R. I have split the data into 80% and 20%, and built the coxph model using 80% of the data. I would like to test the model on the remaining 20% to predict the disease status. What code would achieve this? Is there a function in the survival package...

2

votes

2

answer

26

Views

### Regression model point estimation

I'd like to retrieve the values of a second order polynomial regression line based on a list of values for a parameter.
Here is the model:
fit

2

votes

0

answer

23

Views

### Is it allowed/possible to call an R function or fortran code within a pragma openmp parallel for loop in Rcpp?

In an Rcpp project I would like to be able to either call an R function (the cobs function from the cobs package to do a concave spline fit) or call the fortran code that it relies on (the cobs function uses quantreg's rq.fit.sfnc function to fit the constrained spline model, which in turn relies on...

1

votes

2

answer

362

Views

### Does fitting a sklearn Linear Regression classifier multiple times add data points or just replace them?

X = np.array(df.drop([label], 1))
X_lately = X[-forecast_out:]
X = X[:-forecast_out]
df.dropna(inplace=True)
y = np.array(df[label])
X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2)
linReg.fit(X_train, y_train)
I've been fitting my linear regression classifie...
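
For scikit-learn's LinearRegression, each call to fit re-estimates the model from the data passed to that call, so earlier data is replaced rather than accumulated. A small sketch that checks this, using made-up data (the arrays below are not the asker's):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Two unrelated synthetic datasets (hypothetical, for illustration only).
X_a = rng.normal(size=(50, 2))
y_a = X_a @ np.array([1.0, 2.0])
X_b = rng.normal(size=(50, 2))
y_b = X_b @ np.array([-3.0, 0.5]) + 4.0

model = LinearRegression()
model.fit(X_a, y_a)  # first fit
model.fit(X_b, y_b)  # second fit starts over on the new data

# A fresh model trained only on the second dataset gives the same parameters.
fresh = LinearRegression().fit(X_b, y_b)
same = np.allclose(model.coef_, fresh.coef_) and np.isclose(
    model.intercept_, fresh.intercept_
)
print(same)
```

To train on both batches, they would have to be concatenated and passed to a single fit call.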

3

votes

2

answer

175

Views

### How to add “greater than 0 and sums to 1” constraint to a regression in Python?

I am using statsmodels (open to other Python options) to run some linear regression. My problem is that I need the regression to have no intercept, and I need to constrain the coefficients to the range (0,1) and to sum to 1.
I tried something like this (for the sum of 1, at least):
from statsmodels.formula....
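
One option outside statsmodels is to treat this as a constrained least-squares problem and hand it to scipy.optimize.minimize with the SLSQP method. A sketch under stated assumptions: the data is synthetic, `sse` is a helper defined here, and the bounds are the closed interval [0, 1] rather than the open interval in the question:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data whose true coefficients already satisfy the constraints.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([0.2, 0.5, 0.3])
y = X @ true_w + rng.normal(scale=0.01, size=200)


def sse(w):
    """Sum of squared residuals of the no-intercept model y ~ X @ w."""
    resid = y - X @ w
    return resid @ resid


k = X.shape[1]
result = minimize(
    sse,
    x0=np.full(k, 1.0 / k),                 # feasible starting point
    method="SLSQP",
    bounds=[(0.0, 1.0)] * k,                # each coefficient in [0, 1]
    constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
)
weights = result.x
print(weights, weights.sum())
```

This gives point estimates only; standard errors and p-values from an unconstrained fit do not carry over to the constrained solution.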

0

votes

0

answer

7

Views

### Linear regression load model doesn't predict as expected

I have trained a linear regression model, with sklearn, for a 5 star rating and it's good enough. I have used Doc2vec to create my vectors, and saved that model. Then I save the linear regression model to another file. What I'm trying to do is load the Doc2vec model and linear regression model and t...

1

votes

3

answer

58

Views

### How can I use stepwise regression to remove a specific coefficient in logistic regression within R?

When I run the logistic regression for a cars dataset:
carlogistic.fit4 |z|)
(Intercept) -2.697e+01 5.226e+00 -5.161 2.45e-07 ***
Weight -6.006e-03 7.763e-04 -7.737 1.02e-14 ***
Year 5.677e-01 8.440e-02 6.726 1.75e-11 ***
OriginGerman 1.256e+00 5.172e-01 2.428...

0

votes

1

answer

6

Views

### gridSearch performance measure effect

I have an assignment and it asks me to:
Improve the performance of the models from the previous step with
hyperparameter tuning and select a final optimal model using grid
search based on a metric (or metrics) that you choose. Choosing an
optimal model for a given task (comparing multiple regressor...

1

votes

1

answer

99

Views

### Publication-style output for 1st stage coefficients and R2 of ivregress

I am running the following in Stata:
eststo: ivregress 2sls y (x=z) control [aw=weight], cluster(cluster) first
esttab using file.tex, b(%9.3f) se(%9.3f) r2(%9.8f) replace
This produces a publication-style table for 2nd stage.
However, what should I do to do that for 1st stage? I need coefficien...

1

votes

2

answer

357

Views

### sklearn Python and Logistic regression

Good night, community!
I have a simple question whose answer may not be as simple:
How can I show the independent variable coefficients of a Logistic regression model using Python's SciKit Learn?
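
A fitted scikit-learn LogisticRegression exposes the coefficients as `coef_` and the intercept as `intercept_`. A minimal sketch on synthetic data; the feature names are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
# Synthetic binary outcome driven by the first two features.
y = (2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

feature_names = ["x1", "x2", "x3"]  # hypothetical names
clf = LogisticRegression().fit(X, y)

# For a binary problem coef_ has shape (1, n_features).
for name, coef in zip(feature_names, clf.coef_[0]):
    print(f"{name}: {coef:.3f}")
print("intercept:", clf.intercept_[0])
```

Keep in mind that LogisticRegression applies L2 regularization by default (C=1.0), so these values are shrunk relative to an unpenalized fit.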

1

votes

1

answer

675

Views

### R predicting from multivariate polynomial models

I have 3 columns of data in a dataframe (data) with no headers.
The 1st and 2nd column are the independent variables and the 3rd column is the dependent variable.
I have to fit a polynomial of order 3 in the independent variables.
I did:
dm

1

votes

1

answer

1.5k

Views

### R: auto.arima() with xreg vs. lm()

I am trying to understand how auto.arima() with linear regression vs. lm() works.
My assumption, which seems not to be true, is that when you use auto.arima() and specify xreg, a linear model is fit to the overall series, and then an ARMA model is used to further fit the residuals. I get th...

1

votes

1

answer

4.2k

Views

### Syntax for stepwise logistic regression in r

I am trying to conduct a stepwise logistic regression in R with a dichotomous DV. I have researched the step function, which uses AIC to select a model and essentially requires having a NULL and a FULL model. Here's the syntax I've been trying (I have a lot of IVs, but the N is 100,000+):
Full = gl...

1

votes

2

answer

712

Views

### Questions of xgboost with R

I used xgboost to do logistic regression. I followed the steps from, but I got two problems. The datasets are found here.
First, when I run the following code:
bst

1

votes

1

answer

74

Views

### Linear regression on variables that do not scale directly with the output

I've been trying to follow a machine learning course on Coursera. So far, most of the linear regression models introduced use variables whose numerical values are positively correlated with the output.
Input: square feet of the house
Output: house price.
I'm, however, trying to implement a m...

1

votes

1

answer

1.1k

Views

### Explanation of Isotropic Kernel

What is an isotropic kernel? What are its features? How can we use it in the context of non-parametric regression, such as kernel regression? An intuitive explanation followed by metrics would be helpful.

1

votes

1

answer

195

Views

### Pandas Multivariate Linear Regression by Group and Saving Results as csv

I am trying to calculate a linear regression of the Y=C-A column on x = ['Plate X', 'Plate Y', 'Field X'], grouping those values by Drum and Plate. Additional question: how do I save the results as a file, preferably csv?
Is the pandas package sufficient for this task, or is another package needed?
Thank you
There is...

1

votes

0

answer

10

Views

### Interpreting OLS Weights after PCA (in Python)

I want to interpret the regression model weights in a model where the input data has been pre-processed with PCA. In reality, I have 100s of input dimensions which are highly correlated, so I know that PCA is useful. However, for the sake of illustration I will use the Iris dataset.
The sklearn code...

1

votes

2

answer

121

Views

### Change cornerpoint in generalized linear model

fit attach(dat)
> levels(x9)
[1] 'ja' 'nej'
> x9 levels(x9)
[1] 'nej' 'ja' ###Changing the level was successful
> summary(glm(y~.,family = binomial, data=dat))
Call:
glm(formula = y ~ ., family = binomial, data = dat)
Deviance Residuals:
Min 1Q Median...

1

votes

1

answer

2.2k

Views

### What is target in Python's sklearn coef_ output?

When I do ridge regression using sklearn in Python, the coef_ output gives me a 2D array. According to the documentation it is (n_targets, n_features).
I understand that features are my coefficients. However, I am not sure what targets are. What is this?
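
In scikit-learn, a target is one output column of y. Each row of `coef_` holds the coefficients for one target, with one entry per feature. A small sketch with synthetic data showing how the shape of y drives the shape of `coef_`:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # 3 features

# Two output columns, i.e. two targets predicted from the same features.
Y = np.column_stack([
    X @ np.array([1.0, 0.0, -1.0]),  # target 0
    X @ np.array([0.5, 2.0, 0.0]),   # target 1
])

multi = Ridge(alpha=1.0).fit(X, Y)
print(multi.coef_.shape)   # (n_targets, n_features)

column = Ridge(alpha=1.0).fit(X, Y[:, [0]])  # y passed as a 2D column
print(column.coef_.shape)

single = Ridge(alpha=1.0).fit(X, Y[:, 0])    # y passed as a 1D array
print(single.coef_.shape)
```

So a 2D `coef_` with a single row usually means y was passed as a 2D column; flattening y (for example with `y.ravel()`) gives a 1D `coef_`.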

1

votes

1

answer

468

Views

### fminsearch for non linear regression Matlab?

Can anyone explain to me how I can apply fminsearch to this equation to find the value of K (the diode ideality factor) using the Matlab command window?
I = 10^-9(exp(38.68V/k)-1)
I have data values as follows:
Voltage := [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]:
Current:= [0, 0, 0, 0, 0,...

1

votes

1

answer

405

Views

### 10 fold cross validation python single variable regression

How does numpy's vstack work when I am splitting the data into 10 folds?
X_set = np.split(X, 10)
Y_set = np.split(Y, 10)
for i in range(len(X_set)):
    X_test= ?
    Y_test= ?
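
One way to fill in the loop: take fold i as the test set and stack the remaining nine folds back together for training. This is a sketch with made-up data; it assumes X is stored as a column of shape (n, 1) and that n is divisible by 10, which np.split requires:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))  # single feature, stored as a column
Y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.5, size=100)

X_set = np.split(X, 10)  # list of 10 arrays, each of shape (10, 1)
Y_set = np.split(Y, 10)  # list of 10 arrays, each of shape (10,)

fold_mse = []
for i in range(len(X_set)):
    X_test, Y_test = X_set[i], Y_set[i]
    # vstack joins the 2D folds row-wise; the 1D targets need concatenate,
    # because vstack would turn each 1D fold into its own row.
    X_train = np.vstack(X_set[:i] + X_set[i + 1:])       # shape (90, 1)
    Y_train = np.concatenate(Y_set[:i] + Y_set[i + 1:])  # shape (90,)

    slope, intercept = np.polyfit(X_train[:, 0], Y_train, deg=1)
    pred = slope * X_test[:, 0] + intercept
    fold_mse.append(np.mean((Y_test - pred) ** 2))

print(np.mean(fold_mse))
```

The folds here are contiguous blocks, so shuffling the rows first is advisable if the data is ordered.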

1

votes

1

answer

132

Views

### R: Robust linear regression using a list having repeated number

I'm using an rlm model like this.
fit=rlm(log(y) ~ x + z)
z is a list that contains only 1s. I get the error Error in rlm.default(x, y, weights, method = method, wt.method = wt.method, : 'x' is singular: singular fits are not implemented in 'rlm'
Is it equivalent to use fit=rlm(log(y) ~ x + 1) in...

1

votes

1

answer

863

Views

### Apache Spark MLlib LabeledPoint null label issue

I'm trying to run one of the MLlib algorithms, namely LogisticRegressionWithLBFGS, on my database.
This algorithm takes the training set as LabeledPoint. Since LabeledPoint requires a double label ( LabeledPoint( double label, Vector features) ) and my database contains some null values, how can I solve...

1

votes

3

answer

842

Views

### Test Method Fails in Test Suite but passes individually in .net C#

I am encountering a very strange problem in .net regression testing. I have a test method which fails when I run the complete test suite, but the same test method passes when run individually.
What could be the possible reason behind it? I double checked that other test methods are having no effect...

1

votes

3

answer

93

Views

### trying to print multiple logistic regressions in statsmodels python

I'm trying to print a series of logistic regressions in statsmodels but am unsure how to print the results to something other than the console screen. I've created a function that runs the regressions where data is the dataset, and the other variables are a series of lists of dummy variable labels f...

1

votes

1

answer

427

Views

### sklearn r2_score and python stats linregress function give very different values of R^2. Why?

I'm using the same data but different Python libraries to calculate the coefficient of determination R^2. Using the stats library and sklearn yields different results.
What is the reason behind this behavior?
# Using stats linregress
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)...
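
The excerpt is cut off, so the actual cause is unknown, but two common sources of such a mismatch are comparing r_value (the correlation coefficient r) with R^2 without squaring it, and passing something other than the model's predictions as the second argument of r2_score(y_true, y_pred). The sketch below uses plain NumPy on made-up data to mirror both definitions and show that, for a least-squares line with an intercept, r squared equals 1 - SS_res/SS_tot:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 1.5 * x + 4.0 + rng.normal(scale=2.0, size=50)

# Pearson correlation: the quantity linregress reports as r_value.
r = np.corrcoef(x, y)[0, 1]

# Coefficient of determination as r2_score defines it: 1 - SS_res / SS_tot,
# computed from the fitted line's predictions.
slope, intercept = np.polyfit(x, y, deg=1)
y_pred = slope * x + intercept
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(r, r ** 2, r_squared)
```

If the two numbers still differ after squaring r_value, it is worth checking that r2_score receives the true values first and the predictions second.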

1

votes

1

answer

858

Views

### Extract degrees of freedom in R

I am running a large number of linear regressions, and for each regression I would like to save the adjusted R squared and the degrees of freedom, each in a separate file.
The code below does this perfectly for the adjusted R squared, and I can add the value name of the list to the file (so I can ide...

1

votes

1

answer

344

Views

### Multivariate regression in Matlab

I have been all over Google trying to find a good function/package to perform multivariate regression (i.e. predict multiple continuous variables given another set of multiple continuous variables).
I wish to use something like fitlm(), since that also gives me p-value statistics and R squared stati...

1

votes

1

answer

235

Views

### How to get two random effects crossed with one nested in the other in nlme?

My nonlinear mixed-effects model regresses body mass (bm) on age. I would like to consider that brood is nested within year, but as a brood can only occur in one of the seven years that are in the dataset, the random effects of year and brood should be crossed.
In Pinheiro & Bates (2000): ‘Mixed-Effe...

1

votes

1

answer

918

Views

### predict vector values instead of single output

In linear regression I've always seen the situation where I have many features and I use them to predict a single output, for example
f1 f2 f3 f4 --> y1
f1 f2 f3 f4 --> y2
and so on...
I want to know if there is something where the predicted value, i.e. y1, is actually a vector rather than a single value.
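
Ordinary least squares extends directly to this case: if the targets are stored as a matrix with one row per sample, the coefficient vector becomes a coefficient matrix. A minimal NumPy sketch with made-up, noise-free data (the names `F`, `W_true`, and `W_hat` are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.normal(size=(100, 4))     # features f1..f4, one row per sample
W_true = rng.normal(size=(4, 3))  # maps 4 features to a 3-component output
Y = F @ W_true                    # each row of Y is a 3-vector

# Append a column of ones so the model also estimates an intercept per output.
F1 = np.column_stack([F, np.ones(len(F))])

# lstsq accepts a 2D right-hand side and solves for all outputs at once.
W_hat, *_ = np.linalg.lstsq(F1, Y, rcond=None)

Y_pred = F1 @ W_hat
print(Y_pred[0])  # predicted vector for the first sample
```

This is equivalent to fitting one independent linear regression per output component; it does not model correlation between the components.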

1

votes

1

answer

97

Views

### Robust regression in scilab

For a robust linear regression, I want to implement an M-estimator with the Geman-McClure loss function.
The class of M-estimators is presented in this document, and Geman-McClure can be found on page 13.
To solve the minimization problem, Iteratively reweighted least squares is recommended. How c...
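
The question is about Scilab, but the iteration is language-independent, so here is a Python/NumPy sketch of the idea. It assumes the loss ρ(r) = (r²/2)/(1 + r²), whose weight function is w(r) = 1/(1 + r²)², and it uses unscaled residuals; in practice the residuals are usually divided by a robust scale estimate first. The function name and the test data are made up:

```python
import numpy as np


def irls_geman_mcclure(X, y, n_iter=50):
    """Iteratively reweighted least squares with weights w(r) = 1 / (1 + r^2)^2."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS solution
    for _ in range(n_iter):
        r = y - X @ beta                   # current residuals
        w = 1.0 / (1.0 + r ** 2) ** 2      # large residuals get tiny weights
        sw = np.sqrt(w)
        # Weighted least squares: scale each row by sqrt(w) and solve again.
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta


# Line y = 2x + 1 with small noise and five gross outliers at the high end.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.05, size=50)
y[-5:] += 30.0

X = np.column_stack([x, np.ones_like(x)])  # columns: slope, intercept
ols = np.linalg.lstsq(X, y, rcond=None)[0]
robust = irls_geman_mcclure(X, y)
print("OLS:   ", ols)
print("robust:", robust)
```

Because this loss is non-convex, the result can depend on the starting point; a fixed number of iterations is used here for simplicity instead of a convergence check.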

1

votes

1

answer

108

Views

### Using low frequency data to calibrate high frequency data

I have a 10 Hz time series measured by a fast instrument and a 1 minute time series measured by a slow reference instrument. The data consists of a fluctuating meteorological parameter. The slow reference instrument is used to calibrate the fast instrument measurements. Both time series are synchron...

1

votes

1

answer

208

Views

### Reading csv to array, performing linear regression on array and writing to csv in Python depending on gradient

I am having to tackle a problem that far exceeds my current programming skill for Python. I am having difficulty combining different modules (csv reader, numpy etc.) into a single script. My data contains a large list of weather variables across time (with minute resolution) for many days. My object...

1

votes

1

answer

2k

Views

### mgcv gam() error: model has more coefficients than data

I am using GAM (generalized additive models) for my dataset. This dataset has 32 observations, with 6 predictor variables and a response variable (namely power).
I am using the gam() function of the mgcv package to fit the models. Whenever I try to fit a model, I get this error message:
Error in ga...

1

votes

1

answer

83

Views

### How do I obtain slopes on a prediction model that's a curve? And save them as a table

So here’s the game plan. I’m trying to take this data set (will be a structure object) below, run a curved regression model through it.
Then, I’d like to take the slope (i.e. the first derivative value for each x) at each point, and save the data table with that slope information in its own co...
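
The language of the original data is not stated, so the following is only a Python/NumPy sketch of the plan described above: fit a polynomial, differentiate it, evaluate the derivative at each x, and save everything as a table. The data, the polynomial degree, and the output file name are all assumptions:

```python
import os
import tempfile

import numpy as np

# Made-up data lying exactly on a quadratic, standing in for the real data set.
x = np.linspace(0.0, 5.0, 11)
y = 3.0 * x ** 2 + 2.0 * x + 1.0

coeffs = np.polyfit(x, y, deg=2)      # fitted curve, highest power first
slope_coeffs = np.polyder(coeffs)     # coefficients of the first derivative
slopes = np.polyval(slope_coeffs, x)  # slope of the fitted curve at each x

# One row per point: x, y, and the slope in its own column.
table = np.column_stack([x, y, slopes])
out_path = os.path.join(tempfile.gettempdir(), "slopes.csv")  # hypothetical path
np.savetxt(out_path, table, delimiter=",", header="x,y,slope", comments="")
print(table[:3])
```

For a curve that is not a polynomial, the same table can be built by replacing the derivative step with the analytic derivative of whatever model was fitted.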

1

votes

1

answer

415

Views

### Issue with Scikit Learn Package for SVR Regression

I am trying to fit an SVM regression model using the scikit-learn package, but it is not working as I expect.
Could you please help me to find the error? The code that I would like to use is:
from sklearn.svm import SVR
import numpy as np
X = []
x = np.arange(0, 20)
y = [3, 4, 8, 4, 6, 9, 8, 12,...