Questions tagged [regression]

0

votes
0

answer
18

Views

How to combine KNeighborsRegressor with RandomForestRegressor for prediction?

I have done two predictions for some data, one with KNeighborsRegressor, another with RandomForestRegressor, and have scored them. I would now like to use both models combined to make a prediction. I have found online that you can do this by using VotingClassifier. Here is the code that I currently...
Luka Vlaskalic
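For the question above, one possible approach is sketched here. VotingClassifier is meant for classifiers; for two regressors, scikit-learn (0.21 and later) provides VotingRegressor, which averages the predictions of its sub-models. The data below is synthetic and purely illustrative, and the hyperparameters are assumptions, not the asker's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data, purely illustrative (not the asker's data).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = X[:, 0] ** 2 + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

knn = KNeighborsRegressor(n_neighbors=5)
forest = RandomForestRegressor(n_estimators=50, random_state=0)

# VotingRegressor fits a clone of each model and averages their predictions.
ensemble = VotingRegressor(estimators=[("knn", knn), ("rf", forest)])
ensemble.fit(X, y)
combined = ensemble.predict(X[:5])

# The same result by hand: average the fitted sub-models' predictions.
manual = np.mean([est.predict(X[:5]) for est in ensemble.estimators_], axis=0)
print(combined)
print(manual)
```

If one model scored clearly better, the `weights` argument of VotingRegressor can give it more influence in the average.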
0

votes
0

answer
2

Views

How to test the accuracy of COX-PH models in R with ROC curve

I'm working with the lung dataset in the survival package in R. I have split the data into 80% and 20%, and have built the coxph model using 80% of the data. I would like to test the model on the remaining 20% to predict the disease status. What code would achieve this? Is there a function in the survival package...
Beum
2

votes
2

answer
26

Views

Regression model point estimation

I'd like to retrieve the values of a second order polynomial regression line based on a list of values for a parameter. Here is the model: fit
statsguyz
0

votes
1

answer
18

Views
2

votes
0

answer
23

Views

Is it allowed/possible to call an R function or fortran code within a pragma openmp parallel for loop in Rcpp?

In an Rcpp project I would like to be able to either call an R function (the cobs function from the cobs package to do a concave spline fit) or call the fortran code that it relies on (the cobs function uses quantreg's rq.fit.sfnc function to fit the constrained spline model, which in turn relies on...
Tom Wenseleers
1

votes
2

answer
362

Views

Does fitting a sklearn Linear Regression classifier multiple times add data points or just replace them?

X = np.array(df.drop([label], 1)) X_lately = X[-forecast_out:] X = X[:-forecast_out] df.dropna(inplace=True) y = np.array(df[label]) X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2) linReg.fit(X_train, y_train) I've been fitting my linear regression classifie...
Carrot Slat
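For the question above, a minimal sketch that can be used to check the behavior directly: in scikit-learn, calling fit() on a LinearRegression estimator refits from scratch, so the coefficients reflect only the most recent call. The two tiny datasets are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Two made-up datasets with clearly different slopes.
X_a = np.array([[0.0], [1.0], [2.0], [3.0]])
y_a = 2.0 * X_a[:, 0] + 1.0      # slope +2
X_b = np.array([[0.0], [1.0], [2.0], [3.0]])
y_b = -1.0 * X_b[:, 0] + 5.0     # slope -1

model = LinearRegression()

model.fit(X_a, y_a)
slope_after_first = model.coef_[0]

# The second call discards the first fit; it does not pool the data.
model.fit(X_b, y_b)
slope_after_second = model.coef_[0]

print(slope_after_first, slope_after_second)
```

To train on both batches together, the data would have to be concatenated before a single fit() call.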
3

votes
2

answer
175

Views

How to add “greater than 0 and sums to 1” constraint to a regression in Python?

I am using statsmodels (open to other Python options) to run some linear regression. My problem is that I need the regression to have no intercept and to constrain the coefficients to the range (0,1) so that they also sum to 1. I tried something like this (for the sum of 1, at least): from statsmodels.formula....
amaatouq
0

votes
0

answer
7

Views

Linear regression load model doesn't predict as expected

I have trained a linear regression model, with sklearn, for a 5 star rating and it's good enough. I have used Doc2vec to create my vectors, and saved that model. Then I save the linear regression model to another file. What I'm trying to do is load the Doc2vec model and linear regression model and t...
Marilou
1

votes
3

answer
58

Views

How can I use stepwise regression to remove a specific coefficient in logistic regression within R?

When I run the logistic regression for a cars dataset: carlogistic.fit4 |z|) (Intercept) -2.697e+01 5.226e+00 -5.161 2.45e-07 *** Weight -6.006e-03 7.763e-04 -7.737 1.02e-14 *** Year 5.677e-01 8.440e-02 6.726 1.75e-11 *** OriginGerman 1.256e+00 5.172e-01 2.428...
user10001876
0

votes
1

answer
6

Views

gridSearch performance measure effect

I have an assignment and it asks me to: Improve the performance of the models from the previous step with hyperparameter tuning and select a final optimal model using grid search based on a metric (or metrics) that you choose. Choosing an optimal model for a given task (comparing multiple regressor...
CFD
1

votes
1

answer
99

Views

Publication-style output for 1st stage coefficients and R2 of ivregress

I am running the following in Stata: eststo: ivregress 2sls y (x=z) control [aw=weight], cluster(cluster) first esttab using file.tex, b(%9.3f) se(%9.3f) r2(%9.8f) replace This produces a publication-style table for 2nd stage. However, what should I do to do that for 1st stage? I need coefficien...
user42459
1

votes
2

answer
357

Views

sklearn Python and Logistic regression

Good night, community! I have a simple question whose answer may not be as simple: How can I show the independent variable coefficients of a Logistic regression model using Python's SciKit Learn?
StillBuggin
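For the question above, a minimal sketch, assuming a binary target: a fitted LogisticRegression exposes the coefficients in `coef_` and the intercept in `intercept_`. The data and the feature names (`x0`, `x1`, `x2`) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem; feature names are hypothetical.
rng = np.random.RandomState(0)
X = rng.normal(size=(300, 3))
signal = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=300)
y = (signal > 0).astype(int)
feature_names = ["x0", "x1", "x2"]

model = LogisticRegression().fit(X, y)

# For a binary target, coef_ has shape (1, n_features).
coefficients = dict(zip(feature_names, model.coef_[0]))
for name, value in coefficients.items():
    print(f"{name}: {value:+.3f}")
print(f"intercept: {model.intercept_[0]:+.3f}")
```

Note that LogisticRegression applies L2 regularization by default, so these values are shrunk relative to an unpenalized fit.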
1

votes
1

answer
675

Views

R predicting from multivariate polynomial models

I have 3 columns of data in a dataframe (data) with no headers. The 1st and 2nd column are the independent variables and the 3rd column is the dependent variable. I have to fit a polynomial of order 3 in the independent variables. I did: dm
Artemis Fowl
1

votes
1

answer
1.5k

Views

R: auto.arima() with xreg vs. lm()

I am trying to understand how auto.arima() with linear regression vs. lm() works. My assumption, which seems not to be true, is that when you use auto.arima() and specify xreg, a linear model is fit to the overall series, and then an ARMA model is used to further fit the residuals. I get th...
mpettis
1

votes
1

answer
4.2k

Views

Syntax for stepwise logistic regression in r

I am trying to conduct a stepwise logistic regression in R with a dichotomous DV. I have researched the step function that uses AIC to select a model, which essentially requires having a null and a full model. Here's the syntax I've been trying (I have a lot of IVs, but the N is 100,000+): Full = gl...
Carter
1

votes
2

answer
712

Views

Questions of xgboost with R

I used xgboost to do logistic regression. I followed the steps from, but I got two problems. The datasets are found here. First, when I run the following code: bst
liu66
1

votes
1

answer
74

Views

Linear regression on variables that do not scale directly with the output

I've been trying to follow a machine learning course on Coursera. So far, most of the linear regression models introduced use variables whose numerical values are positively correlated with the output. Input: square feet of the house. Output: house price. I'm, however, trying to implement a m...
Mc Kevin
1

votes
1

answer
1.1k

Views

Explanation of Isotropic Kernel

What is an isotropic kernel? What are its features? How can we use it in the context of nonparametric regression such as kernel regression? An intuitive explanation followed by metrics would be helpful.
shan
1

votes
1

answer
195

Views

Pandas Multivariate Linear Regression by Group and Saving Results as csv

I am trying to calculate a linear regression of Y=C-A column, x = ['Plate X', 'Plate Y', 'Field X'] and group those values by Drum and Plate. Additional question: how do I save the results as a file, preferably csv? Is the pandas package sufficient for this task, or is another package needed? Thank you. There is...
Felix
1

votes
0

answer
10

Views

Interpreting OLS Weights after PCA (in Python)

I want to interpret the regression model weights in a model where the input data has been pre-processed with PCA. In reality, I have 100s of input dimensions which are highly correlated, so I know that PCA is useful. However, for the sake of illustration I will use the Iris dataset. The sklearn code...
Tristan Fletcher
1

votes
2

answer
121

Views

Change cornerpoint in generalized linear model

fit attach(dat) > levels(x9) [1] 'ja' 'nej' > x9 levels(x9) [1] 'nej' 'ja' ###Changing the level was succesfull > summary(glm(y~.,family = binomial, data=dat)) Call: glm(formula = y ~ ., family = binomial, data = dat) Deviance Residuals: Min 1Q Median...
econmajorr
1

votes
1

answer
2.2k

Views

What is target in Python's sklearn coef_ output?

When I do ridge regression using sklearn in Python, the coef_ output gives me a 2D array. According to the documentation it is (n_targets, n_features). I understand that features are my coefficients. However, I am not sure what targets are. What is this?
Ashish
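For the question above, a small sketch of what "targets" means: each target is one output column of y that the model predicts. With a 1-D y, Ridge returns a 1-D `coef_`; with a 2-D y, it returns one row of coefficients per output column. The random data is illustrative only.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.normal(size=(50, 4))   # 4 features
Y = rng.normal(size=(50, 2))   # 2 targets (output columns)

single = Ridge(alpha=1.0).fit(X, Y[:, 0])      # y is 1-D
column = Ridge(alpha=1.0).fit(X, Y[:, [0]])    # y is 2-D with one column
multi = Ridge(alpha=1.0).fit(X, Y)             # y is 2-D with two columns

print(single.coef_.shape)
print(column.coef_.shape)
print(multi.coef_.shape)
```

A 2-D `coef_` with a single row therefore usually means y was passed as a column (for example a one-column DataFrame) rather than as a flat array.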
1

votes
1

answer
468

Views

fminsearch for non linear regression Matlab?

Can anyone explain to me how I can apply fminsearch to this equation to find the value of K (Diode Equality Factor) using the Matlab command window. I = 10^-9(exp(38.68V/k)-1) I have data values as follows: Voltage := [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]: Current:= [0, 0, 0, 0, 0,...
Saavin
1

votes
1

answer
405

Views

10 fold cross validation python single variable regression

How does numpy's vstack work when I am splitting the data into 10 folds? X_set = np.split(X, 10) Y_set = np.split(Y, 10) for i in range(len(X_set)): X_test= ? Y_test= ?
Sahith
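For the question above, one way to fill in the loop is sketched here: fold i is held out as the test set, and the remaining nine folds are stacked back together with np.vstack (for the 2-D X) and np.concatenate (for the 1-D Y). The data and the use of np.polyfit as the single-variable model are assumptions made for illustration.

```python
import numpy as np

# Illustrative single-variable data: 100 samples, exactly linear.
X = np.arange(100, dtype=float).reshape(-1, 1)
Y = 3.0 * X[:, 0] + 2.0

X_set = np.split(X, 10)   # list of 10 arrays, each of shape (10, 1)
Y_set = np.split(Y, 10)   # list of 10 arrays, each of shape (10,)

errors = []
for i in range(len(X_set)):
    # Fold i is the test set.
    X_test, Y_test = X_set[i], Y_set[i]
    # All other folds, stacked row-wise, form the training set.
    X_train = np.vstack(X_set[:i] + X_set[i + 1:])
    Y_train = np.concatenate(Y_set[:i] + Y_set[i + 1:])

    # Fit a straight line on the training folds and score on the test fold.
    slope, intercept = np.polyfit(X_train[:, 0], Y_train, deg=1)
    pred = slope * X_test[:, 0] + intercept
    errors.append(np.mean((pred - Y_test) ** 2))

print(np.mean(errors))
```

np.split requires the number of samples to be divisible by 10; np.array_split is the variant that tolerates uneven folds.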
1

votes
1

answer
132

Views

R: Robust linear regression using a list having repeated number

I'm using an rlm model like this. fit=rlm(log(y) ~ x + z) z is a list that contains all 1s. I get the error Error in rlm.default(x, y, weights, method = method, wt.method = wt.method, : 'x' is singular: singular fits are not implemented in 'rlm' Is it equivalent to use fit=rlm(log(y) ~ x + 1) in...
zinon
1

votes
1

answer
863

Views

Apache Spark MLlib LabeledPoint null label issue

I'm trying to run one of MLlib algorithms, namely LogisticRegressionWithLBFGS on my database. This algorithm takes the training set as LabeledPoint. Since LabeledPoint requires a double label ( LabeledPoint( double label, Vector features) ) and my database contains some null values, how can I solve...
Merve Bozo
1

votes
3

answer
842

Views

Test Method Fails in Test Suite but passes individually in .net C#

I am encountering a very strange problem in .net regression testing. I have a test method which fails when I run the complete test suite, but the same test method passes when run individually. What could be the possible reason behind it? I double checked that other test methods are having no effect...
Madhur Maurya
1

votes
3

answer
93

Views

trying to print multiple logistic regressions in statsmodels python

I'm trying to print a series of logistic regressions in statsmodels but am unsure how to print the results to something other than the console screen. I've created a function that runs the regressions where data is the dataset, and the other variables are a series of lists of dummy variable labels f...
MikeD
1

votes
1

answer
427

Views

sklearn r2_score and python stats linregress function give very different values of R^2. Why?

I'm using the same data but different Python libraries to calculate the coefficient of determination R^2. Using the stats library and sklearn yields different results. What is the reason behind this behavior? # Using stats linregress slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)...
Pablo Fleurquin
1

votes
1

answer
858

Views

Extract degrees of freedom in R

I am running a large number of linear regressions, and for each regression I would like to save the adjusted R squared and the degrees of freedom each in a separate file. The code below does this perfectly for the adjusted R squared, and I can add the value name of the list to the file (so I can ide...
research111
1

votes
1

answer
344

Views

Multivariate regression in Matlab

I have been all over Google trying to find a good function/package to perform multivariate regression (i.e. predict multiple continuous variables given another set of multiple continuous variables). I wish to use something like fitlm(), since that also gives me p-value statistics and R squared stati...
1

votes
1

answer
235

Views

How to get two random effects crossed with one nested in the other in nlme?

My nonlinear mixed-effects model regresses body mass (bm) on age. I would like to consider that brood is nested within year, but as a brood can only occur in one of the seven years that are in the dataset, the random effects of year and brood should be crossed. In Pinheiro & Bates (2000): ‘Mixed-Effe...
ebcs
1

votes
1

answer
918

Views

predict vector values instead of single output

In linear regression I've always seen the situation where I have many features and I use them to predict a single output, for example f1 f2 f3 f4 --> y1 f1 f2 f3 f4 --> y2 and so on... I want to know if there is something where the predicted value, i.e. y1, is actually a vector, not a single value.
fady taher
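For the question above, a minimal sketch of multi-output linear regression using ordinary least squares in NumPy: when the target is a matrix with one column per output component, np.linalg.lstsq solves for all columns at once, so each prediction is a vector. The data, the shapes (4 features, 3 outputs) and the name `W_true` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))        # features f1..f4
W_true = rng.normal(size=(4, 3))     # hypothetical true weights
Y = X @ W_true                       # each row of Y is a 3-vector target

# Append a column of ones so the model can also learn an intercept.
X1 = np.hstack([X, np.ones((X.shape[0], 1))])

# One least-squares solve fits every output column simultaneously.
W, *_ = np.linalg.lstsq(X1, Y, rcond=None)

pred = X1[:2] @ W                    # two predictions, each a 3-vector
print(pred)
```

This is equivalent to fitting one independent linear regression per output column; models that share information across outputs are a separate topic.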
1

votes
1

answer
97

Views

Robust regression in scilab

For the purpose of robust linear regression, I want to implement an M-estimator with the Geman-McClure loss function. The class of M-estimators is presented in this document, and Geman-McClure can be found on page 13. To solve the minimization problem, iteratively reweighted least squares is recommended. How c...
peng
1

votes
1

answer
108

Views

Using low frequency data to calibrate high frequency data

I have a 10 Hz time series measured by a fast instrument and a 1 minute time series measured by a slow reference instrument. The data consists of a fluctuating meteorological parameter. The slow reference instrument is used to calibrate the fast instrument measurements. Both time series are synchron...
Buzz
1

votes
1

answer
208

Views

Reading csv to array, performing linear regression on array and writing to csv in Python depending on gradient

I am having to tackle a problem that far exceeds my current programming skill for Python. I am having difficulty combining different modules (csv reader, numpy etc.) into a single script. My data contains a large list of weather variables across time (with minute resolution) for many days. My object...
Joss Kirk
1

votes
1

answer
2k

Views

mgcv gam() error: model has more coefficients than data

I am using GAM (generalized additive models) for my dataset. This dataset has 32 observations, with 6 predictor variables and a response variable (namely power). I am using the gam() function of the mgcv package to fit the models. Whenever I try to fit a model, I get an error message: Error in ga...
Haroon Rashid
1

votes
1

answer
83

Views

How do I obtain slopes on a prediction model that's a curve? And save them as a table

So here’s the game plan. I’m trying to take this data set (will be a structure object) below, run a curved regression model through it. Then, I’d like to take the slope (i.e. the first derivative value for each x) at each point, and save the data table with that slope information in its own co...
Tom
1

votes
1

answer
415

Views

Issue with Scikit Learn Package for SVR Regression

I am trying to fit an SVM regression model using the scikit-learn package, but it is not working as I expect. Could you please help me find the error? The code that I would like to use is: from sklearn.svm import SVR import numpy as np X = [] x = np.arange(0, 20) y = [3, 4, 8, 4, 6, 9, 8, 12,...
David Hoareau
