I put this together to solve a problem where contiguous data generated in JavaScript in the browser needed to be broken into pieces, sent to the server, and reconstructed there. The same tool could be used to buffer rapidly incoming small objects and queue them for batch inserts, for higher performance and less contention on the App Engine datastore.
Jump to the code. I apologize for the CSS failfailfail.
For the “large object out of many small fragments” use case:
The principle is that the sender provides an identifier that uniquely identifies the batch. I use a cookie generated on page load, combined with a counter in the JS that is incremented once for each batch from that client. The cookie was conveniently already there for another purpose.
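To make the shape of what gets sent concrete, here is a rough sketch of the sender side. The real sender is JavaScript in the browser; this is written in Python only to show the idea, and the function name, the field names, and the "cookie-counter" key format are all made up for illustration rather than taken from the library.

```python
def split_into_fragments(payload, chunk_size, cookie, batch_counter):
    """Split payload into fragments tagged with a batch-unique group key.

    Hypothetical helper: the key format and field names are assumptions.
    """
    # Cookie (unique per page load) plus a per-client batch counter.
    groupkey = "%s-%d" % (cookie, batch_counter)
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    # Every fragment carries the expected total and its own index.
    return [
        {"groupkey": groupkey, "index": i, "total": len(chunks), "data": chunk}
        for i, chunk in enumerate(chunks)
    ]


fragments = split_into_fragments("abcdefg", 3, "c0ffee", 2)
```

Each dict would go out as its own request, in any order, since the index and total travel with the fragment.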
On the server, the library takes this identifier and uses it as the memcache root key, called the “groupkey” in the code. Each fragment sent should include the total number of fragments expected and its own index in the array of fragments. On the server, an atomic global counter keeps track of how many pieces have arrived. The API lets you have the library preserve the array order, or you can just put ordering info in the stored value and sort it out yourself when the server hands you the completed array of accumulated fragments. Put simply, the server uses the unique identifier both to store the fragments in memcache and to keep track of the fragment count. When that count equals the expected total, the library raises a Complete exception, allowing you to fetch back an array of all the fragments. Typically this would be done inline with the request that sends the last missing fragment.
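The server-side flow can be sketched roughly like this. It is not the library's actual code: FakeMemcache is an in-process stand-in (not the App Engine memcache API) so the example runs anywhere, and add_fragment and the key layout are hypothetical names. Only the idea of a group key, an atomic counter, and a Complete exception comes from the description above. The sketch also assumes each fragment arrives exactly once and that nothing is evicted from the cache in the meantime.

```python
import threading


class Complete(Exception):
    """Raised when the last expected fragment of a group has arrived."""

    def __init__(self, groupkey, fragments):
        super().__init__(groupkey)
        self.groupkey = groupkey
        self.fragments = fragments


class FakeMemcache:
    """In-process stand-in for a memcache client (assumption, demo only)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incr(self, key, delta=1, initial_value=0):
        # Atomic read-modify-write, like a memcache counter.
        with self._lock:
            value = self._data.get(key, initial_value) + delta
            self._data[key] = value
            return value


def add_fragment(cache, groupkey, index, total, value):
    """Store one fragment; raise Complete once all `total` have arrived."""
    # Store first, then count, so a count of `total` implies all are stored.
    cache.set("%s:frag:%d" % (groupkey, index), value)
    received = cache.incr("%s:count" % groupkey)
    if received == total:
        # Read back by index so the original array order is preserved.
        ordered = [cache.get("%s:frag:%d" % (groupkey, i))
                   for i in range(total)]
        raise Complete(groupkey, ordered)


cache = FakeMemcache()
payload = None
try:
    # Fragments may arrive out of order.
    add_fragment(cache, "c0ffee-2", 1, 2, "world")
    add_fragment(cache, "c0ffee-2", 0, 2, "hello ")
except Complete as done:
    payload = "".join(done.fragments)
```

Catching the exception in the handler for the final fragment is what lets the reassembly happen inline with that request.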
For the case of queueing up small objects for batch insert into BigTable:
This use case is obvious, and the analysis is left to the reader as an exercise (smile). Basically, the concern above about order goes away, and no reconstruction is needed.
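For anyone who wants a head start on the exercise, one possible shape is below. Again this is a sketch under assumptions, not the library's code: DictCache is a toy single-process stand-in for memcache, and enqueue, the key layout, and the batch size are invented for illustration. With concurrent writers, a slot can be claimed before its value is stored, so this version simply skips missing slots; a real version would need to deal with those stragglers and with cache eviction.

```python
class DictCache:
    """Toy single-process stand-in for a memcache client (assumption)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def incr(self, key, delta=1):
        self._data[key] = self._data.get(key, 0) + delta
        return self._data[key]


def enqueue(cache, queue_key, batch_size, value):
    """Buffer one object; return a full batch when one is ready, else None."""
    # The counter hands out 1-based slots; order within a batch is irrelevant.
    slot = cache.incr("%s:count" % queue_key)
    cache.set("%s:item:%d" % (queue_key, slot), value)
    if slot % batch_size != 0:
        return None
    # This writer filled the batch: collect the last `batch_size` slots.
    keys = ["%s:item:%d" % (queue_key, i)
            for i in range(slot - batch_size + 1, slot + 1)]
    batch = [cache.get(k) for k in keys]
    # Skip slots whose value is not visible (see caveat in the text above).
    return [item for item in batch if item is not None]


cache = DictCache()
flushed = []
for n in range(5):
    batch = enqueue(cache, "events", 3, {"n": n})
    if batch:
        # In a real handler this is where one batch insert would happen.
        flushed.append(batch)
```

After five objects with a batch size of three, one batch has been handed back and two objects are still waiting in the buffer.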