#### [ Python gamma.fit returning values that don't seem to give the correct distribution in Excel ]

I have a series of experimental data values *X* and *Y* which are used to produce a scatter graph. This scatter graph looks very similar to a gamma distribution, and I have read papers saying that this experimental data can be represented/modeled using a gamma distribution.

So I have written the following bit of Python code to find the gamma distribution's parameters:

```
import csv
from collections import defaultdict

import scipy.stats as ss

# Read the CSV column by column.
columns = defaultdict(list)
with open('case_1_RTD.csv') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        for i, v in enumerate(row):
            columns[i].append(v)

X = columns[0]
Y = columns[1]
data = [float(i) for i in Y]

# Fit a gamma distribution to the Y values, with the location fixed at 0.
alpha, loc, beta = ss.gamma.fit(data, floc=0)
print(alpha, loc, beta)
```

I then use the outputs from this to generate a gamma distribution in Excel and compare this new gamma distribution data with the original *X, Y* data. The two sets of values are not alike at all.

In Excel I use the function:

```
=Gamma.Dist(X,alpha,beta,False) #I have tried switching alpha and beta around but no luck
```

The fact that I do not use the *X* data set in the Python code is a bit disconcerting, but from what I have read in the SciPy documentation I cannot see where to use it. Does this have something to do with the `loc` variable in Python? (From what I have read, it does not.)

The *X, Y* data sets contain 3718 values, with the smallest *Y* value being 1.11E-297. Could this be causing an issue?

Thanks in advance for any help or guidance

# Answer 1

You seem to be looking to model $Y$ as a non-linear function of $X$, $Y=f(X)$, and not trying to estimate the distribution of $Y$. Apparently from theoretical considerations $f$ is a non-negative function with area under the curve of 1 with an exponentially decaying tail (Wikipedia article on residence time distribution), so you want to use a probability density function, specifically the Gamma distribution pdf.
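For reference, here is the curve in question written out. It assumes the shape/scale parameterization (shape $\alpha$, scale $\beta$, location fixed at 0), which to my knowledge is what both Excel's `GAMMA.DIST` and `scipy.stats.gamma` (with `scale` set to $\beta$) use:

$$
Y = f(X;\alpha,\beta) = \frac{X^{\alpha-1}\, e^{-X/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad X > 0 .
$$

The goal is then to choose $\alpha$ and $\beta$ so that $f(X_i)$ is close to each observed $Y_i$, rather than so that the $Y_i$ values themselves look like draws from a gamma distribution.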

This is not a distribution fitting problem, but rather a non-linear regression problem. I have no idea how to do it in Python, but a quick search for these keywords brought up a promising link.
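One possible way to set up such a regression in Python is a least-squares fit with `scipy.optimize.curve_fit`. The sketch below is only an illustration under some assumptions: the helper names `gamma_pdf` and `fit_gamma_curve` are made up for this example, the location is fixed at 0, and synthetic data stands in for the *X* and *Y* columns of the CSV file. The starting guess `p0` is arbitrary and would probably need adjusting to the scale of the real data.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta (location fixed at 0)."""
    return stats.gamma.pdf(x, a=alpha, loc=0, scale=beta)


def fit_gamma_curve(x, y, p0=(2.0, 1.0)):
    """Least-squares fit of y ~ gamma_pdf(x, alpha, beta); returns (alpha, beta)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep both parameters strictly positive during the optimisation.
    params, _ = curve_fit(gamma_pdf, x, y, p0=p0, bounds=(1e-6, np.inf))
    return params


# Synthetic stand-in for the X and Y columns (assumed true values: alpha=3, beta=2).
rng = np.random.default_rng(0)
x_demo = np.linspace(0.1, 30.0, 200)
y_demo = gamma_pdf(x_demo, 3.0, 2.0) + rng.normal(0.0, 1e-4, x_demo.size)

alpha_hat, beta_hat = fit_gamma_curve(x_demo, y_demo)
print(alpha_hat, beta_hat)
```

Unlike `gamma.fit`, this uses both columns: *X* is the independent variable and *Y* is the curve height at each *X*. With this parameterization, the fitted values should be usable as the `alpha` and `beta` arguments of `=Gamma.Dist(X,alpha,beta,False)` for a comparison in Excel.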