We are trying to build a model to summarize Persian news. About 14,000 news articles were summarized by humans (supervised), and we then extracted all of their sentences (about 180,000) and labeled each one: true if it was selected for the summary, false if not. We also calculated 9 features for each sentence (all feature values lie between 0 and 1). Finally, we used a MaxEnt classifier (logistic regression) for binary classification on our data set.
I really don't know whether low recall with high precision is normal for our classifier, or whether something is wrong in our work. Can anyone offer an explanation?
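To make clear what we mean by precision and recall, here is a minimal sketch of how they can be computed for the "true" (selected) class at a given decision threshold. The `labels` and `scores` below are toy values made up for illustration, not our data, and `precision_recall` is a hypothetical helper, not part of any library. It assumes the classifier outputs a probability per sentence and that a sentence is predicted "true" when that probability is at least the threshold.

```python
def precision_recall(labels, scores, threshold=0.5):
    """Return (precision, recall) for the positive class at a given threshold."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y:
            tp += 1          # selected sentence, correctly predicted
        elif predicted and not y:
            fp += 1          # non-selected sentence, wrongly predicted
        elif not predicted and y:
            fn += 1          # selected sentence that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example (assumption): 4 selected sentences, 6 non-selected ones,
# with most selected sentences scoring below 0.5.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.45, 0.4, 0.3, 0.35, 0.3, 0.2, 0.2, 0.1, 0.05]

for t in (0.5, 0.3):
    p, r = precision_recall(labels, scores, threshold=t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data the default threshold of 0.5 gives high precision and low recall, while lowering the threshold to 0.3 raises recall at the cost of precision. Whether the same trade-off explains our results is exactly what we are unsure about.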