TAGS :Viewed: 7 - Published at: a few seconds ago

[ Find irregular region in 4D numpy array of gridded data (lat/lon) ]

I have a large 4-dimensional dataset of Temperatures [time,pressure,lat,lon]. I need to find all grid points within a region defined by lat/lon indices and calculate an average over the region to leave me with a 2-dimensional array.

I know how to do this if my region is a rectangle (or square) but how can this be done with an irregular polygon?

Below is an image showing the regions I need to average together and the lat/lon grid the data is gridded to in the array

lat/lon grid with regions to average

Answer 1

The general class of problems is called "Point in Polygon", where the (fairly) standard algorithm is based on drawing a test line through the point under consideration and counting the number of times it crosses polygon boundaries (its really cool/weird that it works so simply, I think). This is a really good overview which includes implementation information.

For your problem in particular, since each of your regions are defined based on a small number of square cells - I think a more brute-force approach might be better. Perhaps something like:

  • For each region, form a list of all of the (lat/lon) squares which define it. Depending on how your regions are defined, this may be trivial, or annoying...

  • For each point you are examining, figure out which square it lives in. Since the squares are so well behaves, you can do this manually using opposite corners of each square, or using a method like numpy.digitize.

  • Test whether the square the point lives in, is in one of the regions.

If you're still having trouble, please provide some more details about your problem (specifically, how your regions are defined) --- that will make it easier to offer advice.

Answer 2

I believe this should solve your problem.

The code below generates all cells in a polygon defined by a list of vertices. It "scans" the polygon row by row keeping track of the transition columns where you (re)-enter or exit the polygon.

def row(x, transitions):
    """ generator spitting all cells in a row given a list of transition (in/out) columns."""

    i = 1
    in_poly = True
    y = transitions[0]
    while i < len(transitions):
        if in_poly:
            while y < transitions[i]:
                yield (x,y)
                y += 1
            in_poly = False
            in_poly = True
            y = transitions[i]
        i += 1

def get_same_row_vert(i, vertices):
    """ find all vertex columns in the same row as vertices[i], and return next vertex index as well."""

    vert = []
    x = vertices[i][0]
    while i < len(vertices) and vertices[i][0] == x:
        i += 1
    return vert, i

def update_transitions(old, new):
    """ update old transition columns for a row given new vertices. 

    That is: merge both lists and remove duplicate values (2 transitions at the same column cancel each other)"""

    if old == []:
        return new
    if new == []:
        return old
    o0 = old[0]
    n0 = new[0]
    if o0 == n0:
        return update_transitions(old[1:], new[1:])
    if o0 < n0:
        return [o0] + update_transitions(old[1:], new)
    return [n0] + update_transitions(old, new[1:])

def polygon(vertices):
    """ generator spitting all cells in the polygon defined by given vertices."""

    x = vertices[0][0]
    transitions, i = get_same_row_vert(0, vertices)
    while i < len(vertices):
        while x < vertices[i][0]:            
            for cell in row(x, transitions):
                yield cell
            x += 1
        vert, i = get_same_row_vert(i, vertices)
        transitions = update_transitions(transitions, vert)

# define a "strange" polygon (hook shaped)
vertices = [(0,0),(0,3),(4,3),(4,0),(3,0),(3,2),(1,2),(1,1),(2,1),(2,0)]

for cell in polygon(vertices):
    print cell
    # or do whatever you need to do