How to read multiple linear regression coefficients for dummy variables in Python?

Question

I'm fairly certain this will be quickly answered but I couldn't find anything that addressed my specific use case.

I'm running a multivariate linear regression using scikit-learn. I created dummies for two variables (is_listed and paid_vs_public) and have one numeric variable. I am trying to find how these relate to the target y, called fill-rate.

Here's the output I get:

[('publish_event_start_delta',0.017666929441105143),
('is_listed', -0.36775097784982064),
('event_paid_type', -2.9127009412292346)]

How do I interpret the event_paid_type coefficient when I've encoded it as:

final_df['event_paid_type'] = final_df.event_paid_type.map({'free event':0, 'paid event':1})

Is it multivariate or multiple regression, i.e. many dependent variables or just one dependent variable with many covariates? — Richard Hardy, Jun 14 '17 at 18:26
This question isn't about how to *code* dummies in Python, but how to interpret regression output (from Python). As such, it is on topic here. It is, however, a duplicate. — gung - Reinstate Monica, Jun 15 '17 at 13:13

score 1 · Answer 1 · answered Jun 14 '17 at 21:01

The interpretation of the coefficients in the multiple regression is as follows:

Given: $y = 1 + 10x_{1} + 2x_{2}$

Interpretation: If $x_{2}$ is fixed, then for each change of 1 unit in $x_{1}$, $y$ changes by 10 units.
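As a quick numerical check of that statement, here is a minimal sketch that evaluates the example equation directly. The function name `y` and the fixed value chosen for `x2` are illustrative assumptions, not part of the question's data.

```python
def y(x1, x2):
    """The example equation from the text: y = 1 + 10*x1 + 2*x2."""
    return 1 + 10 * x1 + 2 * x2


# Hold x2 at an arbitrary fixed value and step x1 up by one unit at a time.
x2_fixed = 3
changes = [y(x1 + 1, x2_fixed) - y(x1, x2_fixed) for x1 in range(5)]

# Every one-unit step in x1 moves y by the coefficient on x1.
print(changes)  # [10, 10, 10, 10, 10]
```

Picking a different value for `x2_fixed` leaves the list unchanged, because the $2x_{2}$ term cancels in each difference.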

Let's say that publish_event_start_delta and is_listed are fixed and we vary only event_paid_type, which can take two values: 0 or 1. The intercept is not shown in your output, so it is omitted below; it would be the same constant in both cases.

y = publish_event_start_delta * 0.0176 + is_listed * -0.3677 + event_paid_type * -2.912

If event_paid_type = 0 your equation will be:

y = publish_event_start_delta * 0.0176 + is_listed * -0.3677

If event_paid_type = 1 you get

y = publish_event_start_delta * 0.0176 + is_listed * -0.3677 + (1 * -2.912)

That means that, holding the other variables fixed, the predicted $y$ (fill-rate) for a paid event is 2.912 units lower than for a free event.
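The same comparison can be sketched in plain Python using the coefficients reported in the question. The `predict` helper, the example event values, and the two intercepts are hypothetical, chosen only to show that the paid-versus-free gap does not depend on them.

```python
# Coefficients as reported in the question's output.
COEFS = {
    "publish_event_start_delta": 0.017666929441105143,
    "is_listed": -0.36775097784982064,
    "event_paid_type": -2.9127009412292346,
}


def predict(row, intercept=0.0):
    """Linear prediction: intercept + sum of coefficient * value."""
    return intercept + sum(COEFS[name] * row[name] for name in COEFS)


# Hypothetical event; the two rows differ only in event_paid_type.
free_event = {"publish_event_start_delta": 30, "is_listed": 1, "event_paid_type": 0}
paid_event = dict(free_event, event_paid_type=1)

# Hypothetical intercepts: the real one is not shown in the question.
gaps = []
for intercept in (0.0, 5.0):
    gap = predict(paid_event, intercept) - predict(free_event, intercept)
    gaps.append(gap)
    print(round(gap, 4))
```

Both printed gaps match the event_paid_type coefficient up to floating-point rounding, since the intercept and the other two terms cancel when the rows are subtracted.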
