Counting the rows returned from a JPA query

Despite being almost ten years old, the JPA specification to this day has rather poor support for basic paging/sorting/filtering. Paging/sorting/filtering is used in a lot of (CRUD) applications where the result from a query is shown in a table, and where the user can scroll through the results one page at a time, and where this result can be sorted by clicking on any of the table column headers.

In order to support this a number of things are generally needed:

  1. The total number of rows (or entities) in the full result must be known
  2. There should be support for an offset in the full result and a limit for the amount of rows that will be obtained
  3. The column (attribute) on which to sort must be dynamically added to the query
  4. Search expressions must be dynamically added to the query

As it appears, only offset/limit is directly supported in JPA. A sorting column can only be added dynamically when using the overly verbose and hard to work with Criteria API. Search expressions are somewhat possible to add via the Criteria API as well, but it's an awkward and rather poor mechanism.

Surprisingly, universally counting the number of rows is not possible at all in JPA. In this article we'll look at a very hairy workaround for this using Hibernate specific code.

The core problem is that JPA does not allow subqueries in a SELECT. This makes it impossible to solve this issue in a straightforward manner like:

SELECT 
    COUNT(_user)
FROM (
    SELECT 
        DISTINCT _user
    FROM 
        User _user
    JOIN 
        _user.roles _role
    WHERE 
        _role.id IN (1)
)

For some classes of queries the query can be rewritten to yield the same effect by putting the subquery in the WHERE clause, which is supported by JPA:

SELECT 
    COUNT(_user)
FROM
    User _user
WHERE
    _user IN (
        SELECT 
            DISTINCT _user
        FROM 
            User _user
        JOIN 
            _user.roles _role
        WHERE 
            _role.id IN (1)
    )

Unfortunately this trick does not work when aggregation is used, with or without constructor expressions. Consider for example the following JPQL query:

SELECT
     new com.example.AggregatedStatistic(
          SUM(_statistic.views),
          SUM(_statistic.clicks),
          _statistic.date
     )
FROM
     Statistic _statistic
WHERE
     _statistic.date >= :startDate AND
     _statistic.date <= :endDate
GROUP BY
     _statistic.date

Strange as it may seem, this query is uncountable in JPA, while in SQL this is usually not a problem. So what we could do is generate the corresponding SQL query, surround it by an outer count(*) query and then execute that.

But here we hit another wall. While by definition every JPA implementation must be able to generate SQL from a JPA query, there's no actual standard API to get just this query text.

Now one particular aspect of JPA is that it's almost never a pure implementation (such as e.g. JSF), but a standardization API layered on top of another API. This other API is typically richer. In the case of Hibernate there indeed appears to be a public API available to do the transformation that we need, including handling query parameters (if any).

To demonstrate this, let's first create the Query object in Java. Here we assume that the JPQL query shown above is available as a query named "Statistic.perDate":

TypedQuery<Statistic> typedQuery = 
    entityManager.createNamedQuery("Statistic.perDate", Statistic.class)
                 .setParameter("startDate", twoMonthsBack())
                 .setParameter("endDate", now());

From this typed query we can obtain the Hibernate Query, and from that get the query string. This query string always represents the JPQL (technically, HQL) independent of whether the query was created from JPQL or from a Criteria:

String hqlQueryText= 
    typedQuery.unwrap(org.Hibernate.Query.class).getQueryString()

In order to parse this JPQL (HQL) query text we need to make use of the ASTQueryTranslatorFactory. Using this and the JPA EntityManagerFactory one can get hold of the SQL query text and a collection of parameters:

QueryTranslatorFactory translatorFactory = new ASTQueryTranslatorFactory();
    
QueryTranslator translator = translatorFactory.createQueryTranslator(
    hqlQueryText, hqlQueryText,
    EMPTY_MAP,
    (SessionFactoryImplementor) entityManagerFactory.unwrap(SessionFactory.class), 
    null
);

translator.compile(EMPTY_MAP, false);

After executing the above code the mentioned SQL query and parameters are available from the translator object. We'll first construct the counting query itself:

javax.persistence.Query nativeQuery = entityManager.createNativeQuery(
    "select count(*) from (" +
        translator.getSQLString() +
    ") x"   
);

And then set the parameters back:

ParameterTranslations parameterTranslations = translator.getParameterTranslations();

for (Parameter<?> parameter : typedQuery.getParameters()) {
    String name = parameter.getName();
    for (int position : parameterTranslations.getNamedParameterSqlLocations(name)) {
        nativeQuery.setParameter(
            position + 1, 
            typedQuery.getParameterValue(name)
        );
    }
}

Note that the +1 on the position is needed because of a mismatch between 0-based and 1-based indexing of both APIs.

With all this in place we can now finally execute the query and obtain the count:

Long cnt = ((Number) nativeQuery.getSingleResult()).longValue(); 

The casting here looks a big nasty. In the case of PostgreSQL a BigInteger was returned. I'm not entirely sure if this would be the case for all databases, hence the cast to Number first and then getting the long value from that.

Conclusion

Using the Hibernate specific API it's more or less possible to universally count the results of a query. It's not entirely perfect still, as values set on a JPQL query can often be richer than those set on a native query. For example, you can often set an entity itself as a parameter and the JPA provider will then automatically use the ID of that.

Furthermore using provider specific APIs when using JPA, especially for such an essential functionality, is just not so nice.

Finally, some providers such as EclipseLink do support subqueries in the select clause. For those providers no vendor specific APIs have to be used (and therefor there are no compile time concerns), but the code is of course still not portable.

If/when there will ever be a new JPA version again it would really be nice if the current problems with paging/sorting/filtering could be addressed.

Arjan Tijms

Comments

Popular posts from this blog

Implementing container authentication in Java EE with JASPIC

Jakarta EE Survey 2022

What’s new in Jakarta Security 3?