Development Hints from Bulgaria: JPA Cache behavior with JPQL Update and Delete queries

JPA experience desynchronization between read data and cache while executing modifying queries. Even in the same application. It is helpful to know JPA behavior to avoid cache issues.

In short

Update queries: Lets have an update statement, which will update all rows in the database (no where cause). It will be executed on the database itself only - updating first level cache and second level cache will not be executed. Next time you access some row via that cache, you will get the old value (if the cache has not been invalidated because of timeout).

Delete queries: deleting all rows with JPQL will result removing them from database but they will stay in both caches.

1st Level Cache

If you bypass JPA and execute DML directly on the database, either through native SQL queries, JDBC, or JPQL UPDATE or DELETE queries, then the database can be out of synch with the 1st level cache. If you had accessed objects before executing the DML, they will have the old state and not include the changes. Depending on what you are doing this may be ok, otherwise you may want to refresh the affected objects from the database.

The 1st level, or EntityManager cache can also span transaction boundaries in JPA. A JTA managed EntityManager will only exist for the duration of the JTA transaction in JEE. Typically the JEE server will inject the application with a proxy to an EntityManager, and after each JTA transaction a new EntityManager will be created automatically or the EntityManager will be cleared, clearing the 1st level cache. In an application managed EntityManager, the 1st level cache will exist for the duration of the EntityManager. This can lead to stale data, or even memory leaks and poor performance if the EntityManager is held too long. This is why it is generally a good idea to create a new EntityManager per request, or per transaction. The 1st level cache can also be cleared using the EntityManager.clear() method, or an object can be refreshed using the EntityManager.refresh() method.

2nd Level Cache

If the application is the only application and server accessing the database there is little issue with the 2nd level cache, as it should always be up to date. The only issue is with DML, if the application executes DML directly to the database through native SQL queries, JDBC, or JPQL UPDATE or DELETE queries. JPQL queries should automatically invalidate the 2nd level cache, but this may depend on the JPA provider. If you use native DML queries or JDBC directly, you may need to invalidate, refresh or clear the objects affected by the DML.

If there are other applications, or other application servers accessing the same database, then stale data can become a bigger issue. Read-only objects, and inserting new objects should not be an issue. New objects should get picked up by other servers even when using caching as queries typically still access the database. It is normally only find() operations and relationships that hit the cache. Updated and deleted objects by other applications or servers can cause the 2nd level cache to become stale.

For deleted objects, the only issue is with find() operations, as queries that access the database will not return the deleted objects. A find() by the object's Id could return the object if it is cached, even if it does not exist. This could lead to constraint issues if you add relations to this object from other objects, or failed updates, if you try to update the object. Note that these can both occur without caching, even with a single application and server accessing the database. During a transaction, another user of the application could always delete the object being used by another transaction, and the second transaction will fail in the same way. The difference is the potential for this concurrency issue to occur increases.

For updated objects, any query for the objects can return stale data. This can trigger optimistic lock exceptions on updates, or cause one user to overwrite another user's changes if not using locking. Again note that these can both occur without caching, even with a single application and server accessing the database. This is why it is normally always important to use optimistic locking. Stale data could also be returned to the user.

Refreshing

Refreshing is the most common solution to stale data. Most application users are familiar with the concept of a cache, and know when they need fresh data and are willing to click a refresh button. This is very common in an Internet browser, most browsers have a cache of web pages that have been accessed, and will avoid loading the same page twice, unless the user clicks the refresh button. This same concept can be used in building JPA applications. JPA provides several refreshing options, see refreshing.

Some JPA providers also support refreshing options in their 2nd level cache. One option is to always refresh on any query to the database. This means that find() operations will still access the cache, but if the query accesses the database and brings back data, the 2nd level cache will be refreshed with the data. This avoids queries returning stale data, but means there will be less benefit from caching. The cost is not just in refreshing the objects, but in refreshing their relationships. Some JPA providers support this option in combination with optimistic locking. If the version value in the row from the database is newer than the version value from the object in the cache, then the object is refreshed as it is stale, otherwise the cache value is returned. This option provides optimal caching, and avoids stale data on queries. However objects returned through find() or through relationships can still be stale. Some JPA providers also allow find() operation to be configured to first check the database, but this general defeats the purpose of caching, so you are better off not using a 2nd level cache at all. If you want to use a 2nd level cache, then you must have some level of tolerance to stale data.

Source: https://en.wikibooks.org/wiki/Java_Persistence/Caching

Development Hints from Bulgaria

Thursday, December 3, 2015

JPA Cache behavior with JPQL Update and Delete queries

In short

1st Level Cache

2nd Level Cache

Refreshing

No comments:

Post a Comment