TAGS :Viewed: 3 - Published at: a few seconds ago

[ Using yield with multiple ndb.get_multi_async ]

I am trying to improve efficiency of my current query from appengine datastore. Currently, I am using a synchronous method:

class Hospital(ndb.Model):
      name = ndb.StringProperty()
      buildings= ndb.KeyProperty(kind=Building,repeated=True)
class Building(ndb.Model):
      name = ndb.StringProperty()
      rooms= ndb.KeyProperty(kind=Room,repeated=True)
class Room(ndb.Model):
      name = ndb.StringProperty()
      beds = ndb.KeyProperty(kind=Bed,repeated=True)
class Bed(ndb.Model):
      name = ndb.StringProperty()

Currently I go through stupidly:

currhosp = ndb.Key(urlsafe=valid_hosp_key).get()
nbuilds = ndb.get_multi(currhosp.buildings)
for b in nbuilds:
   rms = ndb.get_multi(b.rooms)
   for r in rms:
      bds = ndb.get_multi(r.beds)
      for b in bds:
          do something with b object

I would like to transform this into a much faster query using get_multi_async

My difficulty is in how I can do this? Any ideas?

Best Jon

Answer 1

using the given structures above, it is possible, and was confirmed that you can solve this with a set of tasklets. It is a SIGNIFICANT speed up over the iterative method.

def get_bed_info(bed_key):
    bed_info = {}
    bed = yield bed_key.get_async()
    format and store bed information into bed_info
    raise ndb.Return(bed_info)

def get_room_info(room_key):
    room_info = {}
    room = yield room_key.get_async()
    beds = yield map(get_bed_info,room.beds)
    store room info in room_info
    room_info["beds"] = beds
    raise ndb.Return(room_info)

def get_building_info(build_key):
    build_info = {}
    building = yield build_key.get_async()
    rooms = yield map(get_room_info,building.rooms)
    store building info in build_info
    build_info["rooms"] = rooms
    raise ndb.Return(build_info)

def get_hospital_buildings(hospital_object):
    buildings = yield map(get_building_info,hospital_object.buildings)
    raise ndb.Return(buildings)

Now comes the main call from the hospital function where you have the hospital object (hosp).

hosp_info = {}
buildings = get_hospital_buildings(hospital_obj)
store hospital info in hosp_info
hosp_info["buildings"] = buildings
return hosp_info

There you go! It is incredibly efficient and lets the schedule complete all the information in the fastest possible manner within the GAE backbone.

Answer 2

You can do something with query.map(). See https://developers.google.com/appengine/docs/python/ndb/async#tasklets and https://developers.google.com/appengine/docs/python/ndb/queryclass#Query_map