Jim Blackler

Viewing raw XML traffic when using Google Data Java Client APIs

Posted by jimblackler on Oct 23, 2008

The Google Data Java Client APIs provide a convenient way for Java prorgams to interface with Google’s public Data APIs, for Google calendars, contacts, and many other services.

It’s such a good abstraction, however, there is no clear way of seeing the underlying XML of the feed. When debugging, and for their interest, programmers may wish to see this data – it’s supposed to be human-readable after all.

After at least an hour of playing around I found the commands to enable the logging in the client’s initialization. Traffic will then be output to the console at run time.

The gotcha is that enabling verbose output on Java’s logging requires not just the logs to be set to Level.ALL, but also the underlying handlers.

Here’s the code.

import java.util.logging.Logger;
import com.google.gdata.client.http.GoogleGDataRequest;
import com.google.gdata.client.http.HttpGDataRequest;

Logger.getLogger(Logger.GLOBAL_LOGGER_NAME).getParent().getHandlers()[0].setLevel(Level.ALL);
Logger.getLogger(GoogleGDataRequest.class.getName()).setLevel(Level.ALL);
Logger.getLogger(HttpGDataRequest.class.getName()).setLevel(Level.ALL);

Posted in General programming || No Comments »

Bringing ‘yield return’ from C# to Java

Posted by jimblackler on Oct 12, 2008

Consider a function that collects and returns a list of results. It might look like this:

public ArrayList<String> retrieveAll(long requesterId);

Or even better, this, so it could use anything that implements Iterable<>:

public Iterable<String> retrieveAll(long requesterId);

However, storing the results in a container which is then returned may be inefficient for the application. It may be better to enable the calling code to act immediately on each collected result rather than first waiting for all results to be collected and stored. If that were possible..

No memory would be used to store a list.
Results could be presented to the user straight away.
Calling code could abort the collecting process part way through based on its own logic – for instance if it already has more results than it can handle.

Custom iterators

One way to avoid an intermediate list is for the collecting function to construct a custom Iterable<> object, which constructs Iterator<> objects that contain the collecting logic. Each execution of next() calculates the next value and returns it straight away for the calling code.

The difficulty for the programmer is that this may require a significantly different structure of the collecting code. This code has to store any state from result to result as private variables inside the iterator. Call stack is reset with each result, so algorithms that use recursion are not possible.

More convenient for the programmer would be a technique that allows the collecting code to keep control of the machine state during the collecting process.

Yield return

C# has ‘yield return’ which allows a function to be structured exactly as if it is building a list. It may keep its own state in local variables and the call stack, but still present each value to the calling code immediately as it is found.

Java doesn’t have yield return, but another pattern that could be used is this:

public void retrieveAll(long requesterId, ResultProcessor<String> collector);

Where ResultHandler<> is a simple interface to an object invoked by the collecting code to return individual results as they are collected.

public interface ResultHandler<T> {
void handleResult(T value) throws CollectionAbortedException;
}

It is left up to the calling code what ResultHandler<> implementing object to supply, and how to handle the data. Anonymous classes would be quite neat here. For instance, a caller could output the results to a console like so:

retreiveAll(myId, new ResultHandler<String>(){
public void handleResult(String value) {
System.out.println(value);
}
});

The collect() function may abort the collecting operation part way through by throwing a CollectionAbortedException.

The disadvantage to this approach is that it is a little less convenient from the perspective of the calling code. The function has a novel pattern, taking ResultHandler<> as a parameter rather than the more familiar Iterable<> as a return value. Logic has to be embedded inside an anonymous or specially constructed ResultHandler<> class.

Unlike the case of returned Iterables, the calling code can not use ‘for each’ loops on the result. Nor can it store or pass around the Iterable<> to other systems as a parameter.

Solution : The yield adapter

A best-of-both-worlds solution would allow a collecting method based on ResultHandler<> to be automatically adapted at run time to an implementation based on Iterable<>. A data-collecting function implemented in this form (convenient for the collecting code) …

public void retrieveAll(long requesterId, ResultHandler<String> collector);

.. would be adapted into this form (convenient for the calling code)..

public Iterable<String> retrieveAll(long requesterId);

This would provide the same benefit C# programs get from yield return. Namely, the ability for both collecting code and calling code to have their own machine state and callstack control throughout the entire collecting and processing operation. It is not necessary for either side to separate logic into methods inside anonymous classes. Collecting code can use complex recursion algorithms. Calling code can store and defer use of the Iterable<> used to collect results, or pass them as parameters to other functions.

Approach	Collector controls flow	Caller controls flow	Requires list	Caller can abort collect	Uses new thread
Collector builds and returns list in form of Iterable<>	yes	yes	yes	no	no
Collector returns Iterable<> containing logic	no	yes	no	yes	no
Caller passes in ResultHandler<>	yes	no	no	yes	no
ResultHandler<> wrapped to Iterable<> with yield adapter.	yes	yes	no	yes	yes

How it works

I am not the first engineer to attempt to bring yield to Java. Aviad Ben Dov’s article in 2007 http://chaoticjava.com/posts/java-yield-return-code-published/ describes a way to do this using bytecode manipulation and classloader modification. My view is that a solution in Java alone would be more practical as a portable library.

My starting premise was that if calling code and collecting code are both to have their own call stack, this could only be achieved with the use of multiple threads.

In addition, I was aware of a Java collection SynchronousQueue which is designed to allow two threads to pass values between each other, each in turn yielding control to the other.

The Yield Adapter simply makes a new custom Iterable<> which is returned to the calling code. The iterable creates iterators on demand, which also starts a new thread for the collection, and makes a SynchronousQueue for communication between the two. Results are wrapped in message object and, .put() into the queue on the collecting side. The iterator side uses .take(). There is also some extra logic to ensure that incomplete reads do not result in leaked resources.

Please note that although the adapter uses threads, the code does not normally need to be ‘thread safe’ in the classic sense. That is because there is never a time that both threads are executing simultaneously. One thread always ‘yields’ to the other. The exception to this is if multiple iterators were ever in effect at once.

Source and demos

The adapter source can be found here with the interfaces employed all here.

A demo here shows how the adapter can be used to allow a recursive scan of files and directories, returning the result through an iterator.

A more complex demo here returns all possible ways the letters in a word can be rearranged, without repeats.

The library source can be downloaded with Subversion or your browser here and the tests and demos are here.

Feedback

I hope that this small library will allow people to develop complex algorithms that present results to calling code as iterators.

If you have any comments on the code, please add them as comments on this article.

Update Feb 2009

Many thanks to Dominic Lachowicz who brought to my attention the fact that the SynchronousQueue does not in fact stop both threads running at the same time. A revision checked in today uses a Semaphore object to achieve the desired behaviour.

Posted in General programming || 8 Comments »

Archives

Categories