Wednesday, April 25, 2012

Fetching arbitrary object graphs in JPA 2

In Java EE, JPA (Java Persistence API) is used to store and retrieve graphs of objects. This works by specifying relations between objects via annotations (or optionally XML). Hand over the root of an object graph to the entity manager and it will persist it. Ask the entity manager for an object with a given Id and you'll get the graph back.

This is all fine and well, but how in this model do we control which branches of the graph are retrieved and to which depth branches should be followed?

The primary mechanism to control this with is the eager/lazy mechanism. Mark a relation as eager and JPA will fetch it upfront, mark it as lazy and it will dynamically fetch it when the relation is traversed. In practice, both approaches have their cons and pros. Mark everything eager and you'll risk pulling in the entire DB for every little bit of data that you need. Mark everything lazy, and you'll not only have to keep the persistence context around (which by itself can be troublesome), but you also risk running into the 1 + N query problem (1 base query is fired, and then an unknown amount of N queries when iterating over its relations). If fetching 1000 items in one query took approximately as long as fetching 1 item per query and firing 1000 queries, then this wouldn't be a problem. Unfortunately, for a relational database this is not the case, not even when using heaps of memory and tons of fast SSDs in RAID.

There are various ways to overcome this. For instance there are proprietary mechanisms for setting the batch size, so not 1000 queries are fired but 10. We could also assume that all entities relating to those 1000 items are all in the (JPA) cache. Then 1000 fetches of 1 entity are indeed about as costly as 1 fetch of 1000 entities, but this is a dangerous assumption. Assume wrong and you might bring down your DB.

The fundamental problem however is that eager/lazy are static properties of the entity model. In practice, the part of the graph that you want often depends on the use case. For a master overview of all Users in a system, you'd probably want a rather shallow graph, but for the detail view of a particular User you most likely need a somewhat deeper one.

Again, there are various solutions for this. One is to write individual JPQL queries for each use case. This certainly works, but the number of queries can grow rapidly out of hand this way (allUsersWithAddress, allUsersWithAddressAndFriends, allUsersWithAddressAndFriendsWithAddress , ...). Another solution that addresses exactly this problem are the fetch profiles that were introduced in Hibernate 3.5. As can be seen in the official documentation, this solution is not particularly JPA friendly. You need access to the native Hibernate session, which is possible, but not pretty. One way or the other, fetch profiles are Hibernate specific.

In this posting I would like to present an alternative solution. It feels a little like fetch profiles, but the graph to be fetched can be specified dynamically and it uses the JPA API only. It works by using the criteria API to programmatically add one or more JOIN FETCH clauses to a query. Unfortunately JPA does not yet have the capabilities to turn a JPQL query into a Criteria query, so either the query must already be a Criteria or it should be a simple find. The following code demonstrates the latter case:

@Stateless
public class SomeDAO {
    
    @PersistenceContext
    private EntityManager entityManager;

    public T findWithDepth(Class<T> type, Object id, String... fetchRelations) {
        CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
        CriteriaQuery<T> criteriaQuery = criteriaBuilder.createQuery(type);
        Root<T> root = criteriaQuery.from(type);

        for (String relation : fetchRelations) {
            FetchParent<T, T> fetch = root;
            for (String pathSegment : relation.split(quote("."))) {
                fetch = fetch.fetch(pathSegment, JoinType.LEFT);
            }
        }

        criteriaQuery.where(criteriaBuilder.equal(root.get("id"), id));

        return getSingleOrNoneResult(entityManager.createQuery(criteriaQuery));
    }

    private <T> T getSingleOrNoneResult(TypedQuery<T> query) {        
        query.setMaxResults(1);
        List<T> result = query.getResultList();
        if (result.isEmpty()) {
            return null;
        }
        
        return result.get(0);
    }
}

The findWithDepth method can now be called with one or more path expressions, where each path is just a chain of properties separated by a dot (like in expression language). E.g.:

User user = someDao.findWithDepth(
    User.class, 15, "addresses", "friends.addresses"
);

The above line would fetch the user with "id" 15, and pre-fetches the addresses associated with that user, as well as the friends and their addresses. (Note that the @Id field is hardcoded to be called "id" here. A more fancy implementation could query the object for it)

This solution, though handy, is however not perfect. While all JPA vendors support fetching multiple relations of one level deep (addresses and friends in the example above), not all of them support fetching chained relations (friends.addresses in the example above). Specifically for Hibernate care should be taken to avoid fetching so-called "multiple bags" (sets and @OrderColumn are a typical solution). Of course it's always wise to avoid creating a huge Cartesian product, which is unfortunately one low-level effect of the underlying relational DB you have to be aware of, even when purely dealing with object graphs.

Despite the problems I outlined with this approach above, I hope it's still useful to someone. Thanks go to my co-workers Jan Beernink and Hongqin Chen for coming up with the original idea respectively refining it.

Arjan Tijms

No comments:

Post a Comment