[ Using yield with multiple ndb.get_multi_async ]
I am trying to improve efficiency of my current query from appengine datastore. Currently, I am using a synchronous method:
class Hospital(ndb.Model):
name = ndb.StringProperty()
buildings= ndb.KeyProperty(kind=Building,repeated=True)
class Building(ndb.Model):
name = ndb.StringProperty()
rooms= ndb.KeyProperty(kind=Room,repeated=True)
class Room(ndb.Model):
name = ndb.StringProperty()
beds = ndb.KeyProperty(kind=Bed,repeated=True)
class Bed(ndb.Model):
name = ndb.StringProperty()
.....
Currently I go through stupidly:
currhosp = ndb.Key(urlsafe=valid_hosp_key).get()
nbuilds = ndb.get_multi(currhosp.buildings)
for b in nbuilds:
rms = ndb.get_multi(b.rooms)
for r in rms:
bds = ndb.get_multi(r.beds)
for b in bds:
do something with b object
I would like to transform this into a much faster query using get_multi_async
My difficulty is in how I can do this? Any ideas?
Best Jon
Answer 1
using the given structures above, it is possible, and was confirmed that you can solve this with a set of tasklets. It is a SIGNIFICANT speed up over the iterative method.
@ndb.tasklet
def get_bed_info(bed_key):
bed_info = {}
bed = yield bed_key.get_async()
format and store bed information into bed_info
raise ndb.Return(bed_info)
@nbd.tasklet
def get_room_info(room_key):
room_info = {}
room = yield room_key.get_async()
beds = yield map(get_bed_info,room.beds)
store room info in room_info
room_info["beds"] = beds
raise ndb.Return(room_info)
@ndb.tasklet
def get_building_info(build_key):
build_info = {}
building = yield build_key.get_async()
rooms = yield map(get_room_info,building.rooms)
store building info in build_info
build_info["rooms"] = rooms
raise ndb.Return(build_info)
@ndb.toplevel
def get_hospital_buildings(hospital_object):
buildings = yield map(get_building_info,hospital_object.buildings)
raise ndb.Return(buildings)
Now comes the main call from the hospital function where you have the hospital object (hosp).
hosp_info = {}
buildings = get_hospital_buildings(hospital_obj)
store hospital info in hosp_info
hosp_info["buildings"] = buildings
return hosp_info
There you go! It is incredibly efficient and lets the schedule complete all the information in the fastest possible manner within the GAE backbone.
Answer 2
You can do something with query.map(). See https://developers.google.com/appengine/docs/python/ndb/async#tasklets and https://developers.google.com/appengine/docs/python/ndb/queryclass#Query_map