I am a programmer, not a statistician, so pardon any botched terminology. My basic problem is this: I want to calculate $R^2$ between a known concentration (which can be any non-negative value) and a discrete measurement (where the values are all integers). There are 92 possible observations, and each of the 92 has a known actual concentration. We wish to measure the accuracy of the measurements by looking at the $R^2$.
The different cases (each a different molecule) have concentrations that vary across orders of magnitude, so we use the log (base 2) of the values when calculating the $R^2$. This is an industry-standard convention for this kind of data.
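For concreteness, here is a minimal sketch of the calculation as I currently do it. The values are made up, and the +1 pseudocount for zero counts is just my own workaround, not part of the convention:

```python
import numpy as np
from scipy import stats

# Made-up example: known concentrations (continuous, positive) and one
# run's measured integer counts for a handful of molecules.
known = np.array([0.1, 0.5, 2.0, 8.0, 32.0, 118.5])
counts = np.array([0, 1, 2, 9, 30, 119])

# log2-transform both sides; the +1 pseudocount is my own workaround
# to avoid log2(0) when a molecule is not detected at all.
x = np.log2(known)
y = np.log2(counts + 1)

# R^2 taken from a simple linear regression of log-counts on
# log-concentration.
result = stats.linregress(x, y)
print(result.rvalue ** 2)
```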
In most cases this works well. However, where the expected measurements are low, I suspect it produces an artificially low $R^2$. For example, if the known concentration should result in a measurement of 0.1, the only possible measurements are 0 or 1, so either outcome is interpreted as error. Even if there are around 10 molecules whose concentrations should each give a measurement around 0.1, and we get one detection for one of them and 0 for the other 9, this is still interpreted as error.
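To illustrate what I mean, here is a quick simulation. I'm assuming, purely for illustration, that the counting noise is roughly Poisson; the real noise may differ:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 92 expected counts spanning orders of magnitude; the low end sits
# well below 1 count, like the 0.1 case described above.
expected = np.logspace(-3.5, 7, 92, base=2)

# One simulated "run": integer counts, modelled here as Poisson draws
# (an assumption for illustration; the real counting noise may differ).
counts = rng.poisson(expected)

x = np.log2(expected)
y = np.log2(counts + 1)  # +1 pseudocount, as above

# R^2 over all molecules vs. over only the high-abundance ones: the
# low-count molecules drag the overall R^2 down.
r2_all = stats.linregress(x, y).rvalue ** 2
high = expected >= 10
r2_high = stats.linregress(x[high], y[high]).rvalue ** 2
print(f"all molecules: {r2_all:.3f}  high-abundance only: {r2_high:.3f}")
```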
For higher concentrations this is obviously less of an issue (if the detections should have been 118.5 and we got either 118 or 119, that is correctly interpreted as very little error). However, I don't want to just make up my own correction for this.
My guess is that there is some standard way of handling the calculation of $R^2$ between a continuous and a discrete variable. Can you point me to it?
I'm doing my calculations in Python using the scipy.stats module, but if you only know the name of the proper calculation method and not the Python code, that's perfectly fine.
P.S. To be clearer: there are 92 molecules of known concentration, and we measure their concentrations using a counting technique. We want to know whether a given measurement run went OK, so the $R^2$ of the run's measurements (which are discrete, i.e. how many counts you have for each molecule) vs. the known true concentrations (which are continuous) is used to compare this run's accuracy against other runs (on the exact same set of molecules). Hopefully the fact that the 92 x-axis values are always the same (where y is the measurement), and that we only ever compare one measurement run of this type to another, makes $R^2$ not too bad a metric to use here.
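In other words, the per-run comparison amounts to something like this (the names `run_a` and `run_b` are hypothetical, and the pseudocount is the same assumption as above):

```python
import numpy as np
from scipy import stats

def run_r2(known, counts):
    """R^2 of one run's log2 counts against the log2 known concentrations.

    The +1 pseudocount for zero counts is my own workaround, not part
    of the convention.
    """
    x = np.log2(np.asarray(known, dtype=float))
    y = np.log2(np.asarray(counts, dtype=float) + 1)
    return stats.linregress(x, y).rvalue ** 2

# known holds the same 92 concentrations for every run; run_a and run_b
# (hypothetical names) are the integer counts from two different runs.
# The run with the higher run_r2(known, ...) is judged the more accurate.
```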