2010-10-28

Manipulating images in App Engine's Blobstore

In my previous post I showed how to take images uploaded to the Blobstore and resize them before storing them in the Datastore. Picsoup used this approach reliably for about six months. I really wanted a solution that could use a faster image serving service such as Picasa, but I ran into terms-of-service issues.

Shortly after I wrote that blog entry, App Engine SDK 1.3.6 was released with the new fast image serving facilities. There are some good tutorials that explain these in detail, so I won't repeat them here.

What I needed for Picsoup was to use the new image serving while still being able to manipulate the images in the Blobstore. This lets users upload images larger than 1MB, which I then resize to 800x600 before storing them. Typically a JPEG at this size is less than 100KB, whereas the originals tend to be around 3MB. I also wanted to provide a rotation feature so the user can correct the image orientation after uploading.
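
As a rough illustration of what "resize to 800x600" means for photos of different shapes, here is a small sketch of the arithmetic for fitting an image inside an 800x600 box while keeping its aspect ratio. The class name, the method name and the sample dimensions are my own assumptions for illustration; this is not the Images service code, only the calculation.

```java
// Hypothetical helper: computes target dimensions only, it does not touch any image data.
final class FitWithinSketch {

    // Returns {width, height} scaled to fit inside maxW x maxH, preserving aspect ratio.
    static int[] fitWithin(int srcW, int srcH, int maxW, int maxH) {
        // The smaller of the two ratios guarantees both sides fit in the box.
        double scale = Math.min((double) maxW / srcW, (double) maxH / srcH);
        if (scale >= 1.0) {
            return new int[] {srcW, srcH}; // already small enough, never upscale
        }
        return new int[] {
            Math.max(1, (int) Math.round(srcW * scale)),
            Math.max(1, (int) Math.round(srcH * scale))
        };
    }

    public static void main(String[] args) {
        int[] landscape = fitWithin(3264, 2448, 800, 600); // assumed camera resolution
        int[] portrait = fitWithin(2448, 3264, 800, 600);
        System.out.println(landscape[0] + "x" + landscape[1]); // 800x600
        System.out.println(portrait[0] + "x" + portrait[1]);   // 450x600
    }
}
```

A portrait photo ends up narrower than 800 pixels because the height is the limiting side.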

The key to this problem is how to get an image out of the Blobstore, manipulate it, and put it back. The Blobstore API has no methods for writing to it directly; you can only write by uploading data through an HTTP POST, which creates a new Blob. So the problem breaks down into three steps:
  1. Get the image from the Blobstore and manipulate it
  2. Upload the new image to the Blobstore
  3. Update Datastore references to the new Blob and remove the old Blob

Step 1: Get the image from the Blobstore and manipulate it

BlobKey bk = new BlobKey(ce.getBlobKey());
ImagesService imagesService = ImagesServiceFactory.getImagesService();
Image oldImage = ImagesServiceFactory.makeImageFromBlob(bk);
Transform rotate = ImagesServiceFactory.makeRotate(90);
Image image = imagesService.applyTransform(rotate, oldImage, ImagesService.OutputEncoding.JPEG);
sendToBlobStore(Long.toString(ce.getId()), "save", image.getImageData());
My domain object, a competition entry (ce), has a BlobKey string property. I use the ImagesService to make an image from the blob and rotate it to create a new image, which is then handed to sendToBlobStore (shown in Step 2).

Step 2: Upload the new image to the Blobstore

This is the step that I imagine the App Engine team will get around to adding to the API at some point. It would be nice to have a function called makeBlobFromImage(Image) to complement makeImageFromBlob(BlobKey). In the meantime I have written my own multipart/form-data post routine:
private static final boolean PRODUCTION_MODE = SystemProperty.environment.value() == SystemProperty.Environment.Value.Production;
    
private static final String URL_PREFIX = PRODUCTION_MODE ? "" : "http://127.0.0.1:8888";

private void sendToBlobStore(String id, String cmd, byte[] imageBytes) throws IOException {
    String urlStr = URL_PREFIX+BlobstoreServiceFactory.getBlobstoreService().createUploadUrl("/blobimage");
    URLFetchService urlFetch = URLFetchServiceFactory.getURLFetchService();
    HTTPRequest req = new HTTPRequest(new URL(urlStr), HTTPMethod.POST, FetchOptions.Builder.withDeadline(10.0));
    
    String boundary = makeBoundary();
    
    req.setHeader(new HTTPHeader("Content-Type","multipart/form-data; boundary=" + boundary));
    
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    
    write(baos, "--"+boundary+"\r\n");
    writeParameter(baos, "id", id);
    write(baos, "--"+boundary+"\r\n");
    writeImage(baos, cmd, imageBytes);
    write(baos, "--"+boundary+"--\r\n");

    req.setPayload(baos.toByteArray());
    try {
        urlFetch.fetch(req);
    } catch (IOException e) {
        // Need a better way of handling Timeout exceptions here - 10 second deadline
        logger.error("Possible timeout?",e);
    }        
}

private static Random random = new Random();    

private static String randomString() {
    return Long.toString(random.nextLong(), 36);
}

private String makeBoundary() {
    return "---------------------------" + randomString() + randomString() + randomString();
}        

private void write(OutputStream os, String s) throws IOException {
    // Encode explicitly so the bytes do not depend on the platform default charset
    os.write(s.getBytes("UTF-8"));
}

private void writeParameter(OutputStream os, String name, String value) throws IOException {
    write(os, "Content-Disposition: form-data; name=\""+name+"\"\r\n\r\n"+value+"\r\n");
}

private void writeImage(OutputStream os, String name, byte[] bs) throws IOException {
    write(os, "Content-Disposition: form-data; name=\""+name+"\"; filename=\"image.jpg\"\r\n");
    write(os, "Content-Type: image/jpeg\r\n\r\n");
    os.write(bs);
    write(os, "\r\n");
}
The sendToBlobStore method takes three arguments:
  1. id - a domain object key id used to update the datastore reference to the new blob
  2. cmd - a command string used to determine how to handle the uploaded data
  3. imageBytes - a byte array of the new image that is to be uploaded
It then builds a multipart/form-data payload to send via the URLFetchService. The deadline has been set to 10 seconds, the current maximum, but as you can see there is still a try..catch block around urlFetch.fetch(req) to catch timeouts. More about this later.
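
To make the payload layout easier to see, here is a minimal sketch that builds the same kind of two-part body using only the standard library and prints it. The class name MultipartSketch, the fixed boundary "XyZ" and the placeholder image bytes are assumptions for illustration; the real routine above uses a random boundary and sends the bytes through the URLFetchService.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone helper; only the wire format mirrors sendToBlobStore.
final class MultipartSketch {

    // Builds a body with a text field "id" followed by a JPEG file part named fileField.
    static byte[] buildBody(String boundary, String id, String fileField, byte[] imageBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Every part is introduced by "--" + boundary on its own line.
        write(out, "--" + boundary + "\r\n");
        write(out, "Content-Disposition: form-data; name=\"id\"\r\n\r\n" + id + "\r\n");
        write(out, "--" + boundary + "\r\n");
        write(out, "Content-Disposition: form-data; name=\"" + fileField
                + "\"; filename=\"image.jpg\"\r\n");
        write(out, "Content-Type: image/jpeg\r\n\r\n");
        out.write(imageBytes, 0, imageBytes.length);
        write(out, "\r\n");
        // The closing delimiter has an extra trailing "--".
        write(out, "--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] placeholder = "JPEGDATA".getBytes(StandardCharsets.ISO_8859_1);
        byte[] body = buildBody("XyZ", "42", "save", placeholder);
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```

Note that the name of the file part ("save" here) is what later shows up as the key in the uploaded blobs map on the callback side.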

Step 3: Update Datastore references to the new Blob and remove the old Blob

The Blobstore calls back to "/blobimage", as defined earlier in sendToBlobStore, once it has finished storing the new blob, so a doPost method is required to handle the incoming callback. When the callback has been processed we have to send a redirect, which means a servlet request handler must be ready to respond to that as well. A possible quirk I've noticed is that the browser follows the redirect with a GET, whereas the URLFetchService follows it with another POST, so the handler has to support both.
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
    resp.getWriter().write("SUCCESS");
}

public void doPost(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {
    // Handle post requests
    String qcmd = req.getParameter("qcmd");        
    if ("success".equals(qcmd)) {
        res.getWriter().write("SUCCESS");
        return;
    }
    
    // Handle upload callbacks
    Map<String, BlobKey> blobs = blobstoreService.getUploadedBlobs(req);
    if (blobs.isEmpty()) {
        throw new ServletException("UploadedBlobs map is empty");
    }
    Entry<String, BlobKey> entry = blobs.entrySet().iterator().next();
    
    String handler = entry.getKey();
    BlobKey blobKey = entry.getValue();
    
    if ("upload".equals(handler)) {
        initialUploadHandler(res, blobKey);
    } else if ("save".equals(handler)) {
        saveHandler(req, blobKey);
        res.sendRedirect(SUCCESS_RESULT);
    } else {
        throw new ServletException("Invalid handler request ["+handler+"]");
    }
}
Here you can see that my handlers for the redirects just send the word SUCCESS. My GWT code reads this and then makes further RPCs to update the front-end. The section to explain here is under the "Handle upload callbacks" comment. I simply take the first entry from the UploadedBlobs map and use its key to decide how to process the callback. The key is the form field name of the file part, which is the "cmd" argument we passed to sendToBlobStore earlier. I have removed a few handlers from this example for brevity, but you can see how I can process an initial upload from a browser differently from an internal upload following a rotate transformation.

The rotate operation we ran in Step 1 passed in the cmd "save" meaning the saveHandler is called:
private void saveHandler(HttpServletRequest req, BlobKey blobKey) {
    Long compEntryId = Long.valueOf(req.getParameter("id"));
    logger.info("Incoming image to save: ["+blobKey.getKeyString()+"] id=["+compEntryId+"]");

    CompEntry ce = dao.getCompEntry(new Key<CompEntry>(CompEntry.class, compEntryId));
    if (ce != null) {
        String oldBlobKey = ce.getBlobKey();
        
        ce.setBlobKey(blobKey.getKeyString());
        ce.setServingUrl(getServingUrl(blobKey));
        ce.setResized(true);
        dao.ofy().put(ce);
        
        // Delete the old Blob
        if (oldBlobKey != null) {
            blobstoreService.delete(new BlobKey(oldBlobKey));
        }
    }
}

private String getServingUrl(BlobKey blobKey) {
    String servingUrl = ImagesServiceFactory.getImagesService().getServingUrl(blobKey);
    // Hack for Dev mode
    if (PRODUCTION_MODE) {
        return servingUrl;
    } else {
        return servingUrl.replaceFirst("http://0.0.0.0:8888", "");                    
    }
}
In saveHandler there is a little bit of Objectify code to update the datastore object to reference the new blob. The old blob is then deleted. Note my little hack in getServingUrl to iron out a difference between the Development and Production environments.
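
One detail worth knowing about that hack: String.replaceFirst treats its first argument as a regular expression, so the dots in "http://0.0.0.0:8888" match any character. It works for this prefix, but a plain prefix check is a little stricter. The sketch below shows one way to do that; the class name, the method name and the sample URLs are made up for illustration.

```java
// Hypothetical helper: strips the dev-server origin so the URL becomes relative.
final class ServingUrlSketch {

    // Dev-server origin taken from the post; other setups may use a different host or port.
    private static final String DEV_PREFIX = "http://0.0.0.0:8888";

    static String toRelative(String servingUrl) {
        // Literal prefix comparison, no regular expression involved.
        return servingUrl.startsWith(DEV_PREFIX)
                ? servingUrl.substring(DEV_PREFIX.length())
                : servingUrl;
    }

    public static void main(String[] args) {
        System.out.println(toRelative("http://0.0.0.0:8888/img/abc"));   // /img/abc
        System.out.println(toRelative("http://images.example.com/abc")); // unchanged
    }
}
```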

Timeouts

I arrived at the design above after a number of experiments. The main constraint that shapes the solution is the URLFetchService timeout. The maximum deadline at the moment is 10 seconds, which seems like plenty of time, but an IOException for a timeout is thrown regularly. For some reason (any explanation gratefully received), when only 1 instance of the app is running in production the deadline is always reached. As soon as 2 or more instances are running, this stops happening. Unfortunately the exception thrown is just an IOException rather than something more specific like a URLFetchDeadlineExceededException, which would be much nicer. On the development server this timeout is never reached.

To get around this timeout issue you just have to make sure that any critical code goes into the Blobstore callback handler. For example, I save the change to the domain object in saveHandler and not in my original call in Step 1. In my GWT front-end I have routines to check that the transformation is complete and show a spinner while waiting.
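
The "check that the transformation is complete" idea boils down to polling until the stored blob key is no longer the old one. The sketch below shows that loop in plain Java. It is not my GWT code (which would use a timer and an RPC rather than Thread.sleep), and the class name, method name and Supplier-based lookup are assumptions for illustration.

```java
import java.util.function.Supplier;

// Hypothetical poller: waits for the entry's blob key to change after a transformation.
final class BlobSwapPoller {

    // Returns true as soon as currentKey yields something other than oldKey,
    // or false after maxAttempts checks (or if the thread is interrupted).
    static boolean waitForNewBlob(Supplier<String> currentKey, String oldKey,
                                  int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (!oldKey.equals(currentKey.get())) {
                return true; // the callback handler has saved the new blob
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis); // wait before checking again
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Simulated lookup: the key "changes" roughly 50 ms after start.
        Supplier<String> lookup =
                () -> System.currentTimeMillis() - start < 50 ? "old-key" : "new-key";
        System.out.println(waitForNewBlob(lookup, "old-key", 20, 10L));
    }
}
```

Because the check only reads state written by the callback handler, it does not matter whether the original fetch call returned normally or hit the deadline.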

Picsoup is now using this code. Go and check it out!