OK, a new answer to an old question, but one that is even more relevant now. Your question is really about finite precision, normally the domain of signal analysis and experimental mathematics.
Double precision (DP) floats let us pretend that finite precision problems don't exist, the same as we do with most real-world mathematical problems. In experimental math there is no pretending.
Single precision (SP) floats force us to consider quantization noise. If our machine learning models inherently reject noise, as neural nets (NN), convolutional nets (CNN), residual nets (ResN), etc. do, then SP most often gives results similar to DP.
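
As a rough illustration of that quantization noise, the sketch below rounds a double to single precision with Python's standard `struct` module and reports the relative error. The helper name `to_single` is made up for this example; it only mimics on the CPU what a toolkit would do when storing a value as SP.

```python
import struct

def to_single(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

x = 0.1
x_sp = to_single(x)

# Relative quantization error introduced by storing x in single precision.
rel_err = abs(x_sp - x) / x

print(f"double: {x!r}")
print(f"single: {x_sp!r}")
print(f"relative error: {rel_err:.2e}")
```

The relative error stays below 2^-24, the unit roundoff of single precision, which is the kind of noise a noise-tolerant model can usually absorb.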
Half precision (HP) floats (now supported in CUDA Toolkit 7.5) require that quantization effects (noise and rounding) be considered. Most likely we'll soon see HP floats in the common machine learning toolkits.
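
To see why HP needs more care, here is a small sketch that again uses only Python's `struct` module (its `"e"` format is IEEE half precision). The helper `to_half` and the example values are assumptions chosen for illustration, not taken from any toolkit.

```python
import struct

def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

# Near 1.0 the spacing between half-precision values is 2**-10 (about 0.001),
# so a small weight update of 1e-4 is rounded away entirely.
weight = 1.0
update = 1e-4
new_weight = to_half(weight + update)
print(new_weight == weight)

# Between 4096 and 8192 the spacing is 4, so 4097 cannot be represented.
print(to_half(4097.0))
```

In this sketch the update is lost and 4097 snaps back to 4096, which is why rounding, and not only noise, has to be considered with HP.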
There is recent work on lower-precision computation, both in floats and in fixed-point numbers. Stochastic rounding has enabled convergence to proceed with CNNs, whereas the solution diverges without it. These papers will help you to improve your understanding of the problems with the use of finite precision numbers in machine learning.
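
A minimal sketch of the idea behind stochastic rounding, assuming a hypothetical fixed-point grid with step `2**-8` and a constant update smaller than half a step. The function names and numbers are mine, chosen for illustration; they are not taken from any particular paper or library.

```python
import math
import random

def round_nearest(x: float, step: float) -> float:
    """Round x to the nearest multiple of step."""
    return round(x / step) * step

def round_stochastic(x: float, step: float, rng: random.Random) -> float:
    """Round x down or up to a multiple of step, rounding up with
    probability equal to the fractional distance above the lower multiple."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower
    return (lower + (1 if rng.random() < frac else 0)) * step

rng = random.Random(0)   # fixed seed so the run is repeatable
step = 2.0 ** -8         # assumed resolution of the fixed-point format
update = step / 10       # an update smaller than half a step
n_steps = 10_000

w_nearest = 0.0
w_stochastic = 0.0
for _ in range(n_steps):
    w_nearest = round_nearest(w_nearest + update, step)
    w_stochastic = round_stochastic(w_stochastic + update, step, rng)

print("exact sum:       ", n_steps * update)
print("round to nearest:", w_nearest)     # stays at 0.0, every update is lost
print("stochastic:      ", w_stochastic)
```

With round-to-nearest every update is discarded and the weight never moves, while stochastic rounding is unbiased on average, so the accumulated value tracks the exact sum up to some random noise.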
To address your questions:
- SP is not so bad. As you point out, it's twice as fast, but it also allows you to put more layers into memory. A bonus is the saving in overhead when getting data on and off the GPU. The faster computations and the lower overhead result in shorter convergence times. That said, HP, for some problems, will be better in some parts of the network and not in others.
- It seems to me that many of the machine learning toolkits handle both SP and DP. Perhaps someone else with a wider range of experience with the toolkits will add their nickel.
- Python will support what the GPU toolkit supports. You don't want to use native Python data types, because then you'll be running an interpreted script on the CPU.
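
The memory point in the list above can be sketched with typed buffers from Python's standard `array` module, used here only as a stand-in for the arrays a GPU toolkit would manage. The weight count `n` is a hypothetical figure.

```python
import sys
from array import array

n = 1_000_000  # hypothetical number of weights in one layer

weights_dp = array("d", [0.0]) * n  # typed buffer of doubles
weights_sp = array("f", [0.0]) * n  # typed buffer of singles

bytes_dp = weights_dp.itemsize * len(weights_dp)
bytes_sp = weights_sp.itemsize * len(weights_sp)

print(f"DP buffer: {bytes_dp / 1e6:.1f} MB")
print(f"SP buffer: {bytes_sp / 1e6:.1f} MB")

# For comparison, a single native Python float is a full object, not a raw value.
print(f"one Python float object: {sys.getsizeof(1.0)} bytes")
```

Halving the bytes per weight is what lets more layers fit in memory and cuts the amount of data to move on and off the device.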
Note that the trend in neural networks now is toward very deep networks, with runs of more than a few days common on the fastest GPU clusters.