
I am participating in a Kaggle competition and would like to know the best way to handle attributes that are present in the training set but missing from the test set. For example, if the training set contains the attributes (x, y, z, prediction) and the test set contains only (x, y), how do I predict "prediction"? Should I use only (x, y) as features, or should I first try to predict the third attribute (z)? If the latter, how should I go about it?
EDIT: Adding the link to the competition: https://www.kaggle.com/talkingdata-adtracking-fraud-detection
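
To make the two options concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the data-generating process, the plain least-squares models, and the helper names `fit_linear` and `predict_linear` are hypothetical and have nothing to do with the actual competition data. Option 1 trains on (x, y) only; option 2 first learns z from (x, y) on the training set, then feeds the imputed z to a model trained on (x, y, z).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): z depends on x and y, and the
# target depends on all three. Real competition data will differ.
n = 500
x = rng.normal(size=n)
y = rng.normal(size=n)
z = 2.0 * x - y + rng.normal(scale=0.1, size=n)
target = x + y + 0.5 * z + rng.normal(scale=0.1, size=n)

train, test = slice(0, 400), slice(400, n)


def fit_linear(features, response):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def predict_linear(coef, features):
    """Apply coefficients from fit_linear to new feature rows."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ coef


xy_train = np.column_stack([x[train], y[train]])
xy_test = np.column_stack([x[test], y[test]])  # z is unavailable at test time

# Option 1: ignore z and train on (x, y) only.
coef_xy = fit_linear(xy_train, target[train])
pred_option1 = predict_linear(coef_xy, xy_test)

# Option 2: learn z from (x, y), then use the imputed z in a model
# that was trained on the full (x, y, z) feature set.
coef_z = fit_linear(xy_train, z[train])
z_test_hat = predict_linear(coef_z, xy_test)
coef_xyz = fit_linear(np.column_stack([xy_train, z[train]]), target[train])
pred_option2 = predict_linear(coef_xyz, np.column_stack([xy_test, z_test_hat]))


def rmse(a, b):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_option1 = rmse(pred_option1, target[test])
rmse_option2 = rmse(pred_option2, target[test])
print(rmse_option1, rmse_option2)
```

In this purely linear sketch the two options produce the same predictions up to floating-point error, because the imputed z is itself a linear function of x and y. With nonlinear models the two can differ, but the imputed z is still computed from x and y alone, so whether it helps is something to check on a validation set rather than assume.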

  • Hi Rupjit, welcome to CV! To improve your question, here are a few tips: 1) Don't write things irrelevant to the question (such as "this is my first question", "please don't downvote"); I removed these parts for you. 2) It seems like you are asking two separate things (how to handle missing features + how to handle timestamps); try asking a single question per post. 3) Telling us which Kaggle competition you are participating in could also help in answering the question. – Jan Kukacka Apr 20 '18 at 08:30
  • For more info, just take the [tour] or see [How to ask a good question](https://stats.stackexchange.com/help/how-to-ask). – Jan Kukacka Apr 20 '18 at 08:31
  • I didn't want to cheat, so I refrained from giving the link to the Kaggle competition. – Rupjit Chakraborty Apr 20 '18 at 10:35
  • Does the challenge explicitly forbid you from asking for help on sites like CV? If yes, you are cheating regardless of whether you write the name or not... – Jan Kukacka Apr 20 '18 at 11:31
  • No, but I was just being ethical. Alright, I will add the link – Rupjit Chakraborty Apr 20 '18 at 11:52
