[ How to build a numpy array row by row in a for loop? ]

This is basically what I am trying to do:

array = np.array()       #initialize the array. This is where the error code described below is thrown

for i in xrange(?):   #in the full version of this code, this loop goes through the length of a file. I won't know the length until I go through it. The point of the question is to see if you can build the array without knowing its exact size beforehand
    A = random.randint(0,10)
    B = random.randint(0,10)
    C = random.randint(0,10)
    D = random.randint(0,10)
    row = [A,B,C,D]
array[i:]= row        # this is supposed to add a row to the array with A,B,C,D as column values

This code doesn't work. First, the np.array() call raises: TypeError: Required argument 'object' (pos 1) not found. I can't give it the data up front because I don't know the final size of the array.

Second, I know the last line is incorrect, but I'm not sure how to express this in Python/NumPy. How can I do this?

Answer 1

A NumPy array must be created with a fixed size. You can create a small one (e.g., one row) and then append rows one at a time, but that is inefficient because each append copies the whole array into a new one. There is no way to efficiently grow a NumPy array gradually to an undetermined size. You need to decide ahead of time what size you want it to be, or accept that your code will be inefficient. Depending on the format of your data, you may be able to read it in directly with something like numpy.loadtxt or one of the reader functions in pandas.
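
As a rough illustration of the options above, here is a minimal sketch. It uses a common workaround the answer does not spell out: collect the rows in an ordinary Python list and convert to an array once at the end, so the array is only ever created at its final size. The function name build_from_rows, the seed argument, and the sample text passed to numpy.loadtxt are all made up for the example, and range(n_rows) merely stands in for reading a file of unknown length.

```python
import io
import random

import numpy as np


def build_from_rows(n_rows, seed=0):
    """Collect rows in a plain list, then convert to an array once."""
    rng = random.Random(seed)
    rows = []  # Python lists are cheap to append to; NumPy arrays are not
    for _ in range(n_rows):  # stand-in for looping over a file of unknown length
        rows.append([rng.randint(0, 10) for _ in range(4)])
    # Single conversion at the end, when the number of rows is finally known
    return np.array(rows)


result = build_from_rows(5)
print(result.shape)

# If the data is plain delimited text, numpy.loadtxt can parse it directly.
# The contents below are hypothetical sample data, not from the question.
text = io.StringIO("1 2 3 4\n5 6 7 8\n")
loaded = np.loadtxt(text, dtype=int)
print(loaded.shape)
```

The list approach still holds every row in memory, so it addresses the repeated copying rather than the memory footprint. For very large files, a reader such as numpy.loadtxt or pandas is likely the better fit.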