Yes it is, provided that is how you read the missing values.
Entropy is computed from an estimated probability distribution, and there is no way to know the probabilities with which the annotators who skipped an item would have chosen each category. So you can assume that the unobserved distribution behind each missing observation matches the observed one (the categories chosen by the other annotators).
This solution is natural if you think that an annotator who didn't pick any category was willing to leave the choice to someone else, ready to accept their decision. In that case it makes complete sense to compute entropy by simply discarding the missing values: they carry no information.
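For concreteness, here is a minimal Python sketch of the "drop the missings" computation (the function name and the use of `None` for missings are my choices; entropy is in nats, matching the example further below):

```python
from collections import Counter
from scipy.stats import entropy  # natural-log (nats) entropy by default

def simple_entropy(labels):
    """Entropy of the observed label distribution, missing values dropped."""
    counts = Counter(x for x in labels if x is not None)  # ignore missings
    return entropy(list(counts.values()))  # scipy normalizes counts to probabilities

# One instance labelled by five annotators; None marks a missing annotation
print(simple_entropy(["a", "a", "a", "a", "c"]))     # ~0.50
print(simple_entropy(["a", "a", None, None, None]))  # 0.0
```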
What if you don't see it this way?
In statistics, the best solution to a problem really depends on how you read the data, and missing values, more than anything else, can be read in different ways.
You may think that the trivial method above biases the estimates, because missing values express doubt, not tacit agreement with whatever the other annotators chose. In that case you could add a kind of Bayesian prior giving equal chances to each category: just add three virtual annotators (columns) to your dataset, each one always choosing a different one of the three categories (implemented in the sketch after the example below). The computed entropies will all be higher, but the effect is strongest on the instances with many missings, while those with few missings will have their entropy raised less. As a result, the instances may be ranked differently by agreement, with a preference for instances with fewer missing categorizations.
Example:
| annotator_1 | annotator_2 | annotator_3 | annotator_4 | annotator_5 | simple entropy (nats) | adjusted entropy (nats) |
|-------------|-------------|-------------|-------------|-------------|-----------------------|-------------------------|
| a           | a           | a           | a           | c           | 0.5                   | 0.9                     |
| a           | a           |             |             |             | 0                     | 0.95                    |
Here the second row shows better agreement (lower entropy) if you simply drop the missings, but worse agreement when you add the "uncertainty prior", i.e. the three virtual annotators with their three different values.
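And a sketch of the adjusted version, which reproduces both columns of the table (the category list and function name are my assumptions; the `Counter` update is the "uncertainty prior", one virtual annotator per category):

```python
from collections import Counter
from scipy.stats import entropy

CATEGORIES = ["a", "b", "c"]  # the three categories assumed in the example

def adjusted_entropy(labels):
    """Entropy (nats) after adding one virtual annotator per category."""
    counts = Counter(x for x in labels if x is not None)
    counts.update(CATEGORIES)  # the "uncertainty prior": +1 count per category
    return entropy(list(counts.values()))

print(adjusted_entropy(["a", "a", "a", "a", "c"]))     # ~0.90
print(adjusted_entropy(["a", "a", None, None, None]))  # ~0.95
```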
Which of these two solutions is better depends only on your take on the data, unless this is part of some model whose performance you can assess on a hold-out set. I don't think this choice will change much either way.