Let's say I have a dataset where each item is labeled with either (1) true positive or (2) unknown (could be true positive, could be true negative).
It seems that if only true positives are labeled, the only error you can penalize is predicting negative for an item labeled as a true positive. Under that scoring, a model that predicts positive for every item gets a perfect score.
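To make the problem concrete, here is a minimal sketch with made-up toy data. The names `labels` and `recall_on_labeled_positives` are hypothetical and only for illustration. The only score that can be computed directly is recall on the labeled positives, and a trivial always-positive predictor maximizes it:

```python
# Hypothetical toy data: 1 = labeled true positive, 0 = unknown (NOT a confirmed negative).
labels = [1, 0, 0, 1, 0, 1, 0, 0]


def recall_on_labeled_positives(labels, predictions):
    """Fraction of labeled positives that the model also predicts as positive."""
    preds_for_positives = [p for l, p in zip(labels, predictions) if l == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


# A "model" that predicts positive for every item, regardless of input.
always_positive = [1] * len(labels)

score = recall_on_labeled_positives(labels, always_positive)
print(score)  # 1.0
```

Nothing in this score can penalize a false positive, because no item is known to be negative.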
Two questions:
Are there metrics for evaluating models on this kind of data, or literature on building models when only true positive labels are available?
Are there additional pieces of data that can make the data usable? For example, labeling a small set of true negatives or knowing the overall population rate of positive cases.
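As a rough illustration of why the population rate might help, here is a sketch that rests on an assumption not stated above: that the labeled positives are a random sample of all positives. Under that assumption, recall measured on the labeled positives estimates the true recall, and Bayes' rule gives precision = recall × P(positive) / P(predicted positive). The function name, the toy data, and the rate of 0.5 are all hypothetical:

```python
def estimated_precision(labels, predictions, positive_rate):
    """Estimate precision from positive/unknown labels plus a known positive rate.

    Assumes the labeled positives are a random sample of all positives.
    labels: 1 = labeled true positive, 0 = unknown.
    predictions: 1 = predicted positive, 0 = predicted negative.
    positive_rate: assumed overall fraction of positives in the population.
    """
    preds_for_positives = [p for l, p in zip(labels, predictions) if l == 1]
    recall = sum(preds_for_positives) / len(preds_for_positives)
    # Share of ALL items (labeled and unknown) that the model flags as positive.
    predicted_positive_rate = sum(predictions) / len(predictions)
    if predicted_positive_rate == 0:
        raise ValueError("model predicts no positives; precision is undefined")
    return recall * positive_rate / predicted_positive_rate


# Hypothetical toy data and an assumed population rate of 0.5.
toy_labels = [1, 0, 0, 1, 0, 1, 0, 0]
always_positive_preds = [1] * len(toy_labels)
selective_preds = [1, 1, 0, 1, 0, 1, 0, 0]

print(estimated_precision(toy_labels, always_positive_preds, 0.5))  # 0.5
print(estimated_precision(toy_labels, selective_preds, 0.5))        # 1.0
```

With this extra piece of information the always-positive model no longer looks perfect: its estimated precision is just the population rate. On small samples the estimate is noisy and can even exceed 1, so this is only a sketch of the idea.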