Covariance (or correlation, or cosine) can be easily and naturally converted into Euclidean distance by means of the law of cosines, because it is a scalar product (i.e. an angular-type similarity) in Euclidean space. Knowing the covariance between two variables i and j, as well as their variances, automatically implies knowing the distance d between the variables: $d_{ij}^2 = \sigma_i^2 + \sigma_j^2 - 2\operatorname{cov}_{ij}$. (That $d_{ij}^2$ is directly proportional to the usual squared Euclidean distance: you obtain the latter if you use the sums-of-squares and the sum-of-crossproducts in place of the variances and the covariance. Both variables should, of course, be centered initially: speaking of "covariances" is equivalent to thinking of the data with the means removed.)
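As an illustration only (the random data, variable names, and NumPy setup below are my own assumptions, not part of the answer), here is a small Python sketch that checks the formula numerically: the distance built from the variances and the covariance equals the squared Euclidean distance between the centered variable vectors, up to the 1/(n-1) factor hidden in the covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))      # 100 observations of two variables i and j (hypothetical data)
Xc = X - X.mean(axis=0)            # center each variable (remove the means)
n = X.shape[0]

cov = np.cov(Xc, rowvar=False)     # 2x2 covariance matrix (divides by n-1)
d2_from_cov = cov[0, 0] + cov[1, 1] - 2 * cov[0, 1]

# Squared Euclidean distance between the two centered variable vectors,
# i.e. sums-of-squares and sum-of-crossproducts instead of variances/covariance.
d2_euclid = np.sum((Xc[:, 0] - Xc[:, 1]) ** 2)

print(d2_from_cov * (n - 1), d2_euclid)   # equal up to floating-point error
```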
Note that this formula implies a negative covariance gives a greater distance than a positive covariance (and this is indeed the case from the geometrical point of view, i.e. when the variables are seen as vectors in the subject space). If you don't want the sign of the covariance to play a role, take its absolute value. Ignoring the negative sign is not a "patching by hand" operation and is warranted when needed: if the cov matrix is positive definite, the abs(cov) matrix will be positive definite too; and hence the distances obtained by the above formula will be true Euclidean distances (Euclidean distance being a particular sort of metric distance).
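A minimal sketch under the same hypothetical setup, showing how a whole distance matrix could be built from abs(cov) with the formula above so that the sign of the covariances plays no role:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))                          # hypothetical data, 4 variables
C = np.abs(np.cov(X - X.mean(axis=0), rowvar=False))  # abs(cov) matrix

v = np.diag(C)                                        # variances (unchanged by abs)
D2 = v[:, None] + v[None, :] - 2 * C                  # d_ij^2 for every pair of variables
D = np.sqrt(np.clip(D2, 0, None))                     # clip guards tiny negative rounding errors
```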
Euclidean distances are universal with respect to hierarchical clustering: any method of such clustering is valid with either Euclidean or squared Euclidean d. But some methods, e.g. average linkage or complete linkage, can be used with any dissimilarity or similarity (not just metric distances). So you could use such methods directly with the cov or abs(cov) matrix or, just for example, with a max(abs(cov))-abs(cov) distance matrix. Of course, the clustering results potentially depend on the exact nature of the (dis)similarity used.
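For instance, a sketch (the data, the number of clusters, and the SciPy setup are illustrative assumptions, not prescribed by the answer) of average linkage applied to the max(abs(cov))-abs(cov) dissimilarity between variables:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 6))                          # hypothetical data, 6 variables
C = np.abs(np.cov(X - X.mean(axis=0), rowvar=False))   # abs(cov) matrix

D = C.max() - C                    # dissimilarity; needs no metric properties for average linkage
np.fill_diagonal(D, 0.0)           # self-dissimilarity must be zero

Z = linkage(squareform(D, checks=False), method='average')
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)                      # cluster membership of the 6 variables
```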