Highest Voted 'statsmodels' Questions - Statistical Analysis Stack Exchange

56

votes

3 answers

Logistic Regression: Scikit Learn vs Statsmodels

I am trying to understand why logistic regression in these two libraries gives different results. I am using the dataset from the UCLA IDRE tutorial, predicting admit based on gre, gpa and rank. rank is treated as a categorical variable,…

asked Mar 25 '16 at 22:01

hurrikale

853
1
8
7

53

votes

2 answers

Pandas / Statsmodel / Scikit-learn

Are Pandas, Statsmodels and Scikit-learn different implementations of machine learning/statistical operations, or are these complementary to one another? Which of these has the most comprehensive functionality? Which one is actively developed…

machine-learning python scikit-learn statsmodels pandas

asked Jan 17 '13 at 01:02

Nik

1,279
2
13
19

25

votes

3 answers

Analyse ACF and PACF plots

I want to see if I am on the right track analysing my ACF and PACF plots. Background (ref.: Philip Hans Franses, 1998): As both the ACF and PACF show significant values, I assume that an ARMA model will serve my needs. The ACF can be used to estimate…

time-series model-selection arma statsmodels

asked Jan 22 '15 at 12:59

Peter Knutsen

367
1
3
8

23

votes

4 answers

Difference between statsmodel OLS and scikit linear regression

I have a question about two methods from different libraries that seem to do the same job. I am trying to build a linear regression model. Here is the code I am using with the statsmodels library's OLS: X_train, X_test, y_train, y_test =…

regression python scikit-learn statsmodels

asked Apr 16 '15 at 23:11

Batuhan B

573
2
5
13

19

votes

2 answers

Ordinal logistic regression in Python

I would like to run an ordinal logistic regression in Python - for a response variable with three levels and with a few explanatory factors. The statsmodels package supports binary logit and multinomial logit (MNLogit) models, but not ordered logit.…

categorical-data python logit ordered-logit statsmodels

asked Aug 21 '15 at 19:39

Hadi

199
1
1
4

14

votes

2 answers

Logistic regression with binomial data in Python

This is probably trivial but I couldn't figure it out. I want to fit a logistic regression model, where my dependent variable is not a Bernoulli variable, but a binomial count. Namely, for each $X_i$, I have $s_i$, the number of successes, and…

regression logistic python statsmodels

asked Apr 19 '16 at 16:49

R S

507
1
5
15

13

votes

4 answers

Statsmodels says ARIMA is not appropriate because series is not stationary, how is it testing that?

I have a time series that I am trying to model with Python's statsmodels ARIMA api. When I apply the following: from statsmodels.tsa.arima_model import ARIMA model = ARIMA(data['Sales difference'].dropna(), order=(2, 1, 2)) results_AR =…

time-series forecasting arima statsmodels

asked Dec 14 '16 at 03:37

Skander H.

10,602
2
33
81

13

votes

1 answer

Cause of a high condition number in a python statsmodels regression?

I'm pretty new to regression analysis, and I'm using python's statsmodels to look at the relationship between GDP/health/social services spending and health outcomes (DALYs) across the OECD. Just to give an idea of the data I'm using, this is a…

regression python statsmodels condition-number

asked Oct 28 '16 at 18:44

pst0102

131
1
1
5

12

votes

2 answers

The identity link function does not respect the domain of the Gamma family?

I am using a gamma generalized linear model (GLM) with an identity link. The independent variable is the compensation of a particular group. Python's statsmodels summary is giving me a warning about the identity link function ("DomainWarning:…

generalized-linear-model python gamma-distribution statsmodels

asked Jul 13 '18 at 19:39

kalidurge

314
3
10

12

votes

1 answer

Why does statsmodels.api.OLS over-report the r-squared value?

I am using statsmodels.api.OLS to fit a linear regression model with 4 input-features. The shape of the data is: X_train.shape, y_train.shape Out[]: ((350, 4), (350,)) Then I fit the model and compute the r-squared value in 3 different…

multiple-regression scikit-learn r-squared statsmodels

asked Mar 14 '17 at 07:12

dhrumeel

281
1
2
8

10

votes

1 answer

Why is forecasting of ARMA models performed by Kalman filter

What are the advantages of expressing an ARMA model as a state-space model and forecasting with a Kalman filter? This methodology is for example used in the SARIMAX implementation of…

forecasting arma kalman-filter state-space-models statsmodels

asked Jul 15 '15 at 12:13

user3429986

317
1
6

10

votes

1 answer

Assessing the Contribution of each Predictor in Linear Regression

Say I build a linear regression model to identify linear dependencies between variables in my data. Some of these variables are categorical variables. If I want to evaluate the contribution of a given predictor, how do I evaluate it? Can I compare…

regression multicollinearity statsmodels

asked Sep 04 '14 at 19:09

Amelio Vazquez-Reina

17,546
26
74
110

9

votes

2 answers

Dummy/baseline models for time series forecasting

I am working on an evaluation of time series forecasting models in Python, more specifically with statsmodels, scikit-learn and tensorflow. I think it makes sense to first compare the model performance to a set of "trivial" models. What are examples…

time-series forecasting python scikit-learn statsmodels

asked Apr 25 '19 at 11:55

clstaudt

243
1
6

9

votes

2 answers

Can we say 50% of data will be between 25th-75th percentile?

Let's say we have the following dataframe: TY_MAX 141 1.004622 142 1.004645 143 1.004660 144 1.004672 145 1.004773 146 1.004820 147 1.004814 148 1.004807 149 1.004773 150 1.004820 151 1.004814 152 1.004834 153 1.005117 154 …

quantiles statsmodels

asked Jul 31 '18 at 12:52

Don Coder

435
4
10

9

votes

1 answer

Proving similarities of two time series

Let's assume an analytical model predicts an epidemic trend over time, i.e. the number of infections over time. We also have computer simulation results over time to verify the performance of the model. The goal is to prove the simulation results and…

r time-series arima granger-causality statsmodels

asked Sep 12 '15 at 20:06

Moe

91
1
1
3
