I have an application where I need to measure the similarity between the (TF-IDF?) representations of two documents, $\mathbf{a}$ and $\mathbf{b}$, while still taking document length into account. More specifically, if document $\mathbf{a}$ is contained within a much larger document $\mathbf{b}$, I do not want the similarity to decrease significantly; ideally, I would want $\texttt{sim}(\mathbf{a}, \mathbf{a}) \approx \texttt{sim}(\mathbf{a}, \mathbf{b})$.
I was thinking of using cosine similarity without the length normalization, i.e.
$\texttt{sim}(\mathbf{a}, \mathbf{b}) = K \, \mathbf{a}^T \mathbf{b}$
where $K$ is a normalizing constant independent of $\mathbf{a}$ and $\mathbf{b}$.
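To make the idea concrete, here is a minimal sketch of what I have in mind, compared against ordinary cosine similarity. The function names, the toy vectors, and the default `K = 1.0` are all made up for illustration. The toy example also assumes that the terms shared by both documents carry the same weights in $\mathbf{b}$ as in $\mathbf{a}$, which will not hold exactly for real TF-IDF vectors.

```python
import numpy as np


def cosine_sim(a, b):
    """Standard cosine similarity: dot product divided by both vector norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def unnormalized_sim(a, b, K=1.0):
    """Dot product scaled by a constant K (hypothetical default of 1.0)."""
    return float(K * (a @ b))


# Toy term-weight vectors (illustrative, not real TF-IDF output).
# b contains every term of a with the same weight, plus extra terms.
a = np.array([1.0, 2.0, 0.0, 0.0, 0.0])
b = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

print("cosine(a, a)       =", cosine_sim(a, a))
print("cosine(a, b)       =", cosine_sim(a, b))
print("unnormalized(a, a) =", unnormalized_sim(a, a))
print("unnormalized(a, b) =", unnormalized_sim(a, b))
```

In this toy case the cosine score drops sharply once the extra terms of $\mathbf{b}$ enter its norm, while the unnormalized score is identical for $(\mathbf{a}, \mathbf{a})$ and $(\mathbf{a}, \mathbf{b})$, which is the behaviour I am after.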
Is there a better way to achieve this?