Google App Engine - Which NDB query function is more efficient to iterate through a big set of query results?


I use NDB in my app, and I use iter() with a limit and a starting cursor to iterate through 20,000 query results in a task. A lot of the time I run into a timeout error:

Timeout: The datastore operation timed out, or the data was temporarily unavailable.

The way I make the call is this:

results = query.iter(limit=20000, start_cursor=cursor, produce_cursors=True)
for item in results:
    process(item)
save_cursor_for_next_time(results.cursor_after().urlsafe())

I could reduce the limit, but I thought a task can run for as long as 10 minutes. 10 minutes should be more than enough time to go through 20,000 results; in fact, on a good run, the task can complete in about a minute.

If I switched to fetch() or fetch_page(), would they be more efficient and less likely to run into the timeout error? I suspect there's a lot of overhead in iter() that causes the timeout error.

Thanks.

fetch is not more efficient, as it uses the same mechanism - unless you know how many entities you want upfront, in which case fetch can be more efficient because you end up with a single round trip.
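For illustration, a minimal sketch of that single-call case, reusing query, cursor and process() from the question (the limit of 20000 is just the number from the question, not a recommendation):

# Sketch only: when you know how many entities you want upfront, one
# fetch() call retrieves the whole result set instead of iterating
# batch by batch.
items = query.fetch(20000, start_cursor=cursor)
for item in items:
    process(item)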

You can increase the batch size for iter, which can improve things. See https://developers.google.com/appengine/docs/python/ndb/queryclass#kwdargs_options

From the docs, the default batch size is 20, which means 20,000 entities is a lot of batches.
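As a rough sketch (same names as in the question; the batch_size of 500 is only an example value), a larger batch size cuts the number of round trips from about 1,000 down to about 40:

# Sketch: the loop from the question, but with a larger batch size so
# 20,000 results need far fewer Datastore round trips.
results = query.iter(limit=20000, start_cursor=cursor,
                     produce_cursors=True, batch_size=500)
for item in results:
    process(item)
save_cursor_for_next_time(results.cursor_after().urlsafe())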

Other things can help too. Consider using map, or map_async, for the processing rather than explicitly calling process(entity) - have a read of https://developers.google.com/appengine/docs/python/ndb/queries#map . Introducing async processing can mean improved concurrency.
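A minimal sketch of that approach, assuming process() can be passed directly as the callback (the options mirror the iter() call in the question; batch_size is again just an example):

# Sketch: let NDB drive the iteration and call process() on each entity;
# with map_async the Datastore fetches and the per-entity work can overlap.
future = query.map_async(process, limit=20000, start_cursor=cursor,
                         batch_size=500)
results = future.get_result()  # list of process() return values

The concurrency gain is largest when process() itself uses async NDB calls (for example put_async), so RPCs can be in flight while other entities are being handled.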

Having said all of that, you should profile so you can understand where the time is actually being used. For instance, the delays could be in process() itself, due to the things you are doing there.
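For example, a crude way to check that (sketch only; it reuses the loop and names from the question):

# Sketch: time the per-entity work to see whether the time is going into
# process() or into the Datastore fetches themselves.
import logging
import time

for item in results:
    started = time.time()
    process(item)
    logging.info('process() took %.3fs', time.time() - started)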

