I have data that looks like the table below. I would like to build a model that can answer the question: "If I have a data point with distance x and y people signed up, how many should I expect to check in?" (subject to all the usual caveats like not extrapolating or being too confident in the results, of course).
Signed Up   Checked In   Yield Rate   Distance (km)
      274          171       62.41%          0.00
      241           44       18.26%        475.92
      132           22       16.67%        342.73
      123           53       43.09%        457.31
      116           20       17.24%        833.41
       41           20       48.78%         51.19
        1            0        0.00%       2833.30
        1            0        0.00%        388.53
        1            0        0.00%       1069.43
        1            1      100.00%        929.65
        1            0        0.00%       1103.63
Note that the yield rate is just (Checked In)/(Signed Up). I tried a basic linear correlation, but for pretty obvious reasons that won't work. What should I do? I've heard of pretty much all the big technologies (R, Python, TensorFlow), but I have very little experience in this space. I'm open to learning, though!
Sorry about the poor tagging: I'm so lost with this problem that I'm not even sure what type of problem I'm trying to solve.