@Peter has given a good answer. I just want to add one point: it is important how the scale is presented or formatted.
For most people, the fewer of the scale's notches are labeled, the more the scale reads as interval rather than ordinal. Compare
(disagree)| --- | --- | --- |(agree)
(disagree)1 --- 2 --- 3 --- 4(agree)
totally disagree --- rather disagree --- rather agree --- totally agree
where the 1st scale is just a measuring grid while the 3rd one is clearly categorical, ordinal. Labels bring in verbal semantics, which turns the points from landmarks into islands.
Osgood's rating scale (used in the semantic differential) is like the 1st or 2nd above; in addition, it is bipolar - that is, two equally valid epithets (or objects) make the scale symmetric, and the scale measures proximity to either of them. Such a bipolar proximity-measuring device seems to be still closer to interval, and further from ordinal, than a unipolar intensity-measuring device (such as shown above), because symmetric opposition de-granulates the "landscape" between the opposites.
So, the way the scale is typically presented in a semantic differential makes one think it is fairly close to interval.
With an ordinal scale it is of course incorrect to do arithmetic (such as computing a mean or summing to a total score) or to check whether the data distribution is normal. The distinction between interval and ordinal implies the notion of an underlying feature that is measured to produce an observed value. If the relation between the underlying and the observed is assumed to be linear, we speak of an interval (equiinterval) scale. If the relation is assumed monotonic and is somehow known (e.g. postulated), then the scale is non-equiinterval; such a scale can easily be transformed into an equiinterval one.
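To illustrate the known-but-nonlinear case, here is a minimal sketch. It assumes, purely for illustration, that the observed score is the logarithm of the underlying feature; the data and the choice of a log link are hypothetical and not taken from any real scale.

```python
import numpy as np

# Hypothetical underlying feature with equally spaced levels.
underlying = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Assumed (postulated) monotonic link: observed = log(underlying).
observed = np.log(underlying)

# Steps between neighbouring observed scores shrink: non-equiinterval.
observed_steps = np.diff(observed)

# Because the link is known, inverting it restores equal steps.
recovered = np.exp(observed)
recovered_steps = np.diff(recovered)

print("observed steps: ", np.round(observed_steps, 3))
print("recovered steps:", np.round(recovered_steps, 3))
```

The point is only that a known monotonic link can be inverted; with an unknown link there is nothing to invert.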
If the relation between the underlying and the observed is assumed monotonic and unknown, we have an ordinal scale. An ordinal scale can be transformed into interval if the transformation rule is worked out. We may derive such rules from our pragmatic desire to maximize some quantity in the analysis we have in mind. For example, one might want linear correlations between items to be as strong as possible. Then the transformation that maximizes the correlations can be solved for. This process of quantifying categorical data is often referred to as optimal scaling.
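As a rough sketch of the idea, the code below quantifies a single 4-point item so that its Pearson correlation with one other numeric variable is as large as possible while the category order is preserved. This is a simplification: full optimal scaling procedures typically alternate over all items, whereas here the second variable is held fixed. The function name `monotone_quantification` and the toy data are hypothetical; the monotone fit uses a hand-written pool-adjacent-violators step rather than any particular library routine.

```python
import numpy as np

def monotone_quantification(codes, y):
    """Assign a numeric value to each ordinal category of `codes`:
    the category means of y, pooled where needed to stay non-decreasing."""
    cats = np.unique(codes)  # sorted category codes
    blocks = []              # each block: [value, n_observations, n_categories]
    for c in cats:
        mask = codes == c
        blocks.append([y[mask].mean(), int(mask.sum()), 1])
        # Pool adjacent blocks while they violate the order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    values = np.repeat([b[0] for b in blocks], [b[2] for b in blocks])
    return dict(zip(cats.tolist(), values.tolist()))

# Toy data (made up): an ordinal item coded 1..4 and a second numeric variable.
item = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])
other = np.array([1, 2, 1, 2, 3, 2, 5, 6, 5, 6, 7, 6], dtype=float)

quant = monotone_quantification(item, other)
item_scaled = np.array([quant[c] for c in item.tolist()])

r_raw = np.corrcoef(item, other)[0, 1]            # integer codes 1..4
r_scaled = np.corrcoef(item_scaled, other)[0, 1]  # quantified categories

print("quantifications:", quant)
print("r with raw codes:", r_raw, " r after scaling:", r_scaled)
```

In this toy data the gap between categories 2 and 3 comes out wider than the other two gaps, which is the kind of unequal spacing that integer codes hide.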