This is an open research question. See, for example, the work by Valera et al. (paper) or extensions of it (e.g. one by Dhir et al., paper).
Edit:
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually, also the likelihood model is known. However, as the availability of real-world data increases, this assumption becomes too restrictive. Data are often heterogeneous, complex, and improperly or incompletely documented. Surprisingly, despite their practical importance, there is still a lack of tools to automatically discover the statistical types of, as well as appropriate likelihood (noise) models for, the variables in a dataset.
(From the Valera paper.)
So when we say that this is an "open question" (oddly enough, quoting myself), we mean that there are currently no good automatic methods for inferring the statistical type of a variable from a finite sample. If you had an infinite sample this would be easy, but since that is not possible, we have to resort to other means.
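To make the finite-sample problem concrete, here is a minimal sketch of the kind of rule of thumb people often fall back on. The function `naive_type_guess` and its `max_categories` threshold are hypothetical names I made up for illustration; this is not the method from the Valera or Dhir papers, just a naive heuristic that shows where such rules break down.

```python
import random

def naive_type_guess(values, max_categories=10):
    """Hypothetical heuristic (not from the cited papers).

    Guesses a statistical type from the stored values alone:
    any non-integer value -> real-valued,
    few distinct integers -> categorical,
    many distinct integers -> count/ordinal.
    """
    distinct = set(values)
    if any(v != int(v) for v in distinct):
        return "real-valued"
    if len(distinct) <= max_categories:
        return "categorical"
    return "count/ordinal"

rng = random.Random(0)

# Real-valued measurements that were rounded to integers before storage.
rounded_real = [round(rng.gauss(50, 15)) for _ in range(1000)]

# The same kind of variable, but only 8 observations.
small_sample = [round(rng.gauss(50, 15)) for _ in range(8)]

# Category labels stored as integer codes.
category_codes = [rng.choice([1, 2, 3]) for _ in range(1000)]

for name, data in [("rounded_real", rounded_real),
                   ("small_sample", small_sample),
                   ("category_codes", category_codes)]:
    print(name, "->", naive_type_guess(data))
```

The first two variables come from the same real-valued process, yet the heuristic never labels either of them as real-valued, and it gives them different labels depending only on the sample size. The integer codes carry no information about whether the categories are ordered. The stored values alone do not pin down the statistical type, which is why the papers above treat this as an inference problem rather than a lookup.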