Hibernate
- Configuration
- Hibernate second-level cache
- How does the second-level cache work?
- How does the query cache work?
- What is the relation between the two caches?
- How to set up the two caches in an application
- Using the second level cache
- Using the query cache
- Pitfall 1 - Query cache worsens performance causing a high volume of queries
- Pitfall 2 - Cache limitations when used in conjunction with @Inheritance
- Pitfall 3 - Cache settings get ignored when using a singleton based cache
- Conclusion
- Useful Links
- SQL logging
Configuration
Hibernate second-level cache
How does the second-level cache work?
The second level cache stores the entity data, but NOT the entities themselves. The data is stored in a 'dehydrated' format which looks like a hash map where the key is the entity Id, and the value is a list of primitive values.
Here is an example of how the contents of the second-level cache look:

    *-----------------------------------------*
    |            Person Data Cache            |
    |-----------------------------------------|
    | 1 -> [ "John" , "Q" , "Public" , null ] |
    | 2 -> [ "Joey" , "D" , "Public" , 1 ]    |
    | 3 -> [ "Sara" , "N" , "Public" , 1 ]    |
    *-----------------------------------------*
The second level cache gets populated when an object is loaded by Id from the database, using for example entityManager.find(), or when traversing lazy initialized relations.
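The dehydrated format described above can be modelled with a plain hash map. The following is only a toy sketch, not Hibernate's actual implementation: the `PersonRow` and `PersonDataCache` classes are hypothetical, and treating the fourth column as a parent Id is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity used only for this illustration.
final class PersonRow {
    final long id;
    final String firstName;
    final String middleInitial;
    final String lastName;
    final Long parentId; // assumed meaning of the fourth column: an Id, not an object reference

    PersonRow(long id, String firstName, String middleInitial, String lastName, Long parentId) {
        this.id = id;
        this.firstName = firstName;
        this.middleInitial = middleInitial;
        this.lastName = lastName;
        this.parentId = parentId;
    }
}

// Toy model of one cache region: entity Id -> dehydrated state.
final class PersonDataCache {
    private final Map<Long, Object[]> region = new HashMap<>();

    // Stores only the column values, never the entity instance itself.
    void put(PersonRow p) {
        region.put(p.id, new Object[] { p.firstName, p.middleInitial, p.lastName, p.parentId });
    }

    // Re-hydrates a fresh instance from the cached values, or returns null on a miss.
    PersonRow get(long id) {
        Object[] state = region.get(id);
        if (state == null) {
            return null;
        }
        return new PersonRow(id, (String) state[0], (String) state[1], (String) state[2], (Long) state[3]);
    }
}
```

Because only values are stored, every hit produces a new object, so two lookups for the same Id never share an instance through the cache.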
How does the query cache work?
The query cache looks conceptually like a hash map where the key is composed of the query text and the parameter values, and the value is a list of the entity Ids that match the query:

    *--------------------------------------------------------*
    |                      Query Cache                       |
    |--------------------------------------------------------|
    | ["from Person where firstName=?", ["Joey"] ] -> [1, 2] |
    *--------------------------------------------------------*
Some queries don't return entities; instead they return only primitive values. In those cases the values themselves are stored in the query cache. The query cache gets populated when a cacheable JPQL/HQL query is executed.
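The composite key described above can be sketched as a small value class. This is a conceptual model only: `QueryKey` and `QueryCacheModel` are hypothetical names, and Hibernate's real key type carries more information than the query text and parameters.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical key: query text plus the bound parameter values.
final class QueryKey {
    private final String queryText;
    private final List<Object> parameters;

    QueryKey(String queryText, Object... parameters) {
        this.queryText = queryText;
        this.parameters = Arrays.asList(parameters.clone());
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof QueryKey)) {
            return false;
        }
        QueryKey other = (QueryKey) o;
        return queryText.equals(other.queryText) && parameters.equals(other.parameters);
    }

    @Override
    public int hashCode() {
        return Objects.hash(queryText, parameters);
    }
}

// Toy query cache: key -> Ids of the matching entities.
final class QueryCacheModel {
    private final Map<QueryKey, List<Long>> results = new HashMap<>();

    void put(QueryKey key, List<Long> ids) {
        results.put(key, ids);
    }

    // Returns the cached Ids, or null if this text/parameter combination is not cached.
    List<Long> get(QueryKey key) {
        return results.get(key);
    }
}
```

The same query text with a different parameter value is a different key, so it is cached separately.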
What is the relation between the two caches?
If a query under execution has previously cached results, then no SQL statement is sent to the database. Instead the query results are retrieved from the query cache, and then the cached entity identifiers are used to access the second level cache.
If the second level cache contains data for a given Id, it re-hydrates the entity and returns it. If the second level cache does not contain the results for that particular Id, then an SQL query is issued to load the entity from the database.
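The lookup sequence above can be traced with a small simulation. It is a sketch under simplifying assumptions: the query key is flattened to a string, the `database` map stands in for a real table, and `TwoCacheLookup` is a hypothetical class rather than anything provided by Hibernate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoCacheLookup {
    // Query cache: query text and parameters (flattened to a String) -> entity Ids.
    final Map<String, List<Long>> queryCache = new HashMap<>();
    // Second level cache: entity Id -> dehydrated column values.
    final Map<Long, Object[]> entityCache = new HashMap<>();
    // Stand-in for the database table, consulted only on cache misses.
    final Map<Long, Object[]> database = new HashMap<>();
    // Counts the simulated selects by Id.
    int selectsIssued = 0;

    // Returns the rows for a cached query, or null if the query itself is not cached
    // (the caller would then have to execute the query SQL).
    List<Object[]> resolveCachedQuery(String queryKey) {
        List<Long> ids = queryCache.get(queryKey);
        if (ids == null) {
            return null;
        }
        List<Object[]> rows = new ArrayList<>();
        for (Long id : ids) {
            Object[] state = entityCache.get(id);
            if (state == null) {
                // Second level cache miss: one select by Id for this entity.
                state = database.get(id);
                selectsIssued++;
                entityCache.put(id, state);
            }
            rows.add(state);
        }
        return rows;
    }
}
```

In this model a cached query costs zero selects only when every Id it returns is also present in the entity cache.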
How to set up the two caches in an application
The first step is to include the hibernate-ehcache jar in the classpath:

    <dependency>
        <groupId>org.hibernate</groupId>
        <artifactId>hibernate-ehcache</artifactId>
        <version>SOME-HIBERNATE-VERSION</version>
    </dependency>
The following parameters need to be added to the configuration of your EntityManagerFactory or SessionFactory:

    <prop key="hibernate.cache.use_second_level_cache">true</prop>
    <prop key="hibernate.cache.use_query_cache">true</prop>
    <prop key="hibernate.cache.region.factory_class">org.hibernate.cache.ehcache.EhCacheRegionFactory</prop>
    <prop key="net.sf.ehcache.configurationResourceName">/your-cache-config.xml</prop>
Prefer using EhCacheRegionFactory instead of SingletonEhCacheRegionFactory.
Using EhCacheRegionFactory means that Hibernate will create separate cache
regions for Hibernate caching, instead of trying to reuse cache regions
defined elsewhere in the application.
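When the factory is bootstrapped from Java rather than from Spring XML, the same four settings can be collected in a map. The sketch below only builds that map; the class and method names are hypothetical, and the keys and values are copied from the properties listed above.

```java
import java.util.HashMap;
import java.util.Map;

final class CacheSettings {
    // Builds the cache-related settings as a map of property overrides.
    static Map<String, Object> secondLevelCacheProperties() {
        Map<String, Object> props = new HashMap<>();
        props.put("hibernate.cache.use_second_level_cache", "true");
        props.put("hibernate.cache.use_query_cache", "true");
        props.put("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        props.put("net.sf.ehcache.configurationResourceName", "/your-cache-config.xml");
        return props;
    }
}
```

With JPA on the classpath, a map like this could be passed as the second argument of `Persistence.createEntityManagerFactory(unitName, props)`, where `unitName` is whatever your persistence unit is called.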
The next step is to configure the cache region settings in the file your-cache-config.xml:

    <?xml version="1.0" ?>
    <ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             updateCheck="false"
             xsi:noNamespaceSchemaLocation="ehcache.xsd"
             name="yourCacheManager">

        <diskStore path="java.io.tmpdir"/>

        <cache name="yourEntityCache"
               maxEntriesLocalHeap="10000"
               eternal="false"
               overflowToDisk="false"
               timeToLiveSeconds="86400" />

        <cache name="org.hibernate.cache.internal.StandardQueryCache"
               maxElementsInMemory="10000"
               eternal="false"
               timeToLiveSeconds="86400"
               overflowToDisk="false"
               memoryStoreEvictionPolicy="LRU" />

        <defaultCache maxElementsInMemory="10000"
                      eternal="false"
                      timeToLiveSeconds="86400"
                      overflowToDisk="false"
                      memoryStoreEvictionPolicy="LRU" />
    </ehcache>
If no cache settings are specified, default settings are taken, but this is probably best avoided. Make sure to give the cache manager a name by filling in the name attribute of the ehcache element.
Giving the cache manager a name prevents it from using the default name, which might already be used somewhere else in the application.
Using the second level cache
The second level cache is now ready to be used. In order to cache entities, annotate them with the @org.hibernate.annotations.Cache annotation:

    @Entity
    @Cache(usage=CacheConcurrencyStrategy.READ_ONLY, region="yourEntityCache")
    public class SomeEntity {
        ...
    }
Associations can also be cached by the second level cache, but by default this is not done. In order to enable caching of an association, we need to apply @Cache to the association itself:

    @Entity
    public class SomeEntity {

        @OneToMany
        @Cache(usage=CacheConcurrencyStrategy.READ_ONLY, region="yourCollectionRegion")
        private Set<OtherEntity> other;
    }
Using the query cache
After configuring the query cache, no queries are cached by default. Queries need to be marked as cacheable explicitly. For example, this is how a named query can be marked as cacheable:

    @NamedQuery(name="account.queryName",
        query="select acct from Account ...",
        hints={
            @QueryHint(name="org.hibernate.cacheable", value="true")
        }
    )

And this is how to mark a criteria query as cacheable:

    List cats = session.createCriteria(Cat.class)
        .setCacheable(true)
        .list();
The next sections go over some pitfalls that you might run into while setting up these two caches. These behaviors work as designed but can still be surprising.
Pitfall 1 - Query cache worsens performance causing a high volume of queries
There is a harmful side effect of how the two caches work, which occurs if the cached entities returned by a query are configured to expire more frequently than the cached query results, or are not cached at all.
If a query has cached results, it returns a list of entity Ids, which is then resolved against the second level cache. If the entities with those Ids were not configured as cacheable, or if they have expired, then a select will hit the database for each entity Id.
For example, if a cached query returned 1000 entity Ids and none of those entities were cached in the second level cache, then 1000 selects by Id will be issued against the database.
The solution to this problem is to configure query results expiration to be aligned with the expiration of the entities returned by the query.
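The effect of misaligned expiration can be worked through with a small calculation. This is a simplified model, not Hibernate code: it assumes both caches were populated at the same moment, that an expired query is re-executed as a single statement which also reloads its entities, and that every expired entity costs one select by Id. The class and parameter names are hypothetical.

```java
final class ExpirationModel {
    // Number of SQL statements issued when a cacheable query that returns
    // resultCount entities runs elapsedSeconds after both caches were populated.
    static int statementsIssued(long elapsedSeconds, long queryTtlSeconds,
                                long entityTtlSeconds, int resultCount) {
        boolean queryExpired = elapsedSeconds >= queryTtlSeconds;
        boolean entitiesExpired = elapsedSeconds >= entityTtlSeconds;
        if (queryExpired) {
            // The query itself is executed again as one statement.
            return 1;
        }
        // Query results are still cached: each expired entity is loaded by Id.
        return entitiesExpired ? resultCount : 0;
    }
}
```

Under these assumptions, with a query time-to-live of 86400 seconds and an entity time-to-live of 3600 seconds, running the query two hours after population costs 1000 statements for 1000 results. With both set to 86400 seconds it costs none until the day is over, and then a single statement.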
Pitfall 2 - Cache limitations when used in conjunction with @Inheritance
It is currently not possible to specify different caching policies for different subclasses of the same parent entity.
For example this will not work:
    @Entity
    @Inheritance
    @Cache(usage=CacheConcurrencyStrategy.READ_ONLY)
    public class BaseEntity {
        ...
    }

    @Entity
    @Cache(usage=CacheConcurrencyStrategy.READ_WRITE)
    public class SomeReadWriteEntity extends BaseEntity {
        ...
    }

    @Entity
    @Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL)
    public class SomeTransactionalEntity extends BaseEntity {
        ...
    }

In this case only the @Cache annotation of the parent class is considered, and all concrete entities use the READ_ONLY concurrency strategy.
Pitfall 3 - Cache settings get ignored when using a singleton based cache
It is advised to configure the cache region factory as an EhCacheRegionFactory, and to specify an ehcache configuration via net.sf.ehcache.configurationResourceName.
There is an alternative to this region factory, which is SingletonEhCacheRegionFactory. With this region factory the cache regions are stored in a singleton using the cache name as a lookup key.
The problem with the singleton region factory is that if another part of the application has already registered a cache with the default name in the singleton, the ehcache configuration file passed via net.sf.ehcache.configurationResourceName is ignored.
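The "first registration wins" behavior behind this pitfall can be illustrated with a tiny registry. This is a conceptual model and not Ehcache's actual code: `SingletonCacheRegistry`, the name `"default"`, and `/other-config.xml` are hypothetical stand-ins.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SingletonCacheRegistry {
    // One JVM-wide registry: cache name -> configuration resource it was created from.
    private static final Map<String, String> REGISTERED = new ConcurrentHashMap<>();

    // Returns the configuration that is actually in effect for the given name.
    static String getOrCreate(String name, String configurationResource) {
        // putIfAbsent keeps the first registration; a later configuration is silently dropped.
        String existing = REGISTERED.putIfAbsent(name, configurationResource);
        return existing != null ? existing : configurationResource;
    }
}
```

If some other module registers `"default"` first, a later call with `/your-cache-config.xml` under the same name gets the other module's configuration back, which is why a distinct name, or the non-singleton factory, avoids the problem.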
Conclusion
The second level and query caches are very useful if set up correctly, but there are some pitfalls to bear in mind in order to avoid unexpected behaviors. All in all, it is a feature that works transparently and that, if used well, can significantly increase the performance of an application.
Useful Links
This blog post is a well-known reference to the inner details of the Hibernate second level and query caches - Truly Understanding the Second-Level and Query Caches
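SQL logging
Note that Hibernate only logs the prepared statements it sends to JDBC, together with their parameters. Prepared statements use ? as the placeholder for query parameters, and the actual parameter values are logged below the prepared statement.
These prepared statements are not the same as the SQL statements that are ultimately sent to the database, and Hibernate cannot log those final queries. The reason is that Hibernate only knows the prepared statements and parameters it hands to JDBC; the actual queries are built by JDBC and sent to the database.
To log the actual queries, a tool such as log4jdbc is required. How to use log4jdbc is not covered here. See: log4jdbc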
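How to find the originating query
The logged queries described above contain a comment that, in most cases, identifies where the query originated. If a query was caused by a load, the comment is /*load your.entity.Name*/. If it is a named query, the comment contains the name of the query.
If it is a lazy initialization query for a one-to-many association, the comment contains the name of the class and the property that triggered it.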
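When scanning a large log, it can help to pull that comment out of each logged statement programmatically. The helper below is a hypothetical sketch that assumes the comment appears at the start of the statement, as in the load example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SqlCommentExtractor {
    // Matches a leading /* ... */ comment and captures its trimmed content.
    private static final Pattern LEADING_COMMENT =
            Pattern.compile("^\\s*/\\*\\s*(.*?)\\s*\\*/", Pattern.DOTALL);

    // Returns the text of the leading comment, or null if the statement has none.
    static String originOf(String loggedSql) {
        Matcher m = LEADING_COMMENT.matcher(loggedSql);
        return m.find() ? m.group(1) : null;
    }
}
```

For a statement that starts with `/*load your.entity.Name*/`, the helper returns `load your.entity.Name`, which can then be grouped or counted.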
Setting up Hibernate query logging
To obtain the query log, add the following properties to the configuration of the session factory:

    <bean id="entityManagerFactory">
        ...
        <property name="jpaProperties">
            <props>
                <prop key="hibernate.show_sql">true</prop>
                <prop key="hibernate.format_sql">true</prop>
                <prop key="hibernate.use_sql_comments">true</prop>
            </props>
        </property>
    </bean>
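The example above shows the configuration of a Spring entity manager factory. Here is what each property does:
- show_sql: enables query logging.
- format_sql: pretty-prints the SQL.
- use_sql_comments: adds an explanatory comment to each query.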
To log the query parameter values, the following log4j configuration, or its equivalent, is needed:

    <logger name="org.hibernate.type">
        <level value="trace" />
    </logger>
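If all else fails
In most cases, the comment created by use_sql_comments is enough to identify the origin of a query. If that is not sufficient, we can use the table names in the query to identify the entity it returns, and set a breakpoint in the constructor of that entity.
If the entity has no constructor, we can create one and put the breakpoint on the call to super().

    @Entity
    public class Employee {

        public Employee() {
            super(); // put the breakpoint here
        }
        ...
    }

Once the breakpoint is hit, switch to the Debug view that shows the program's call stack and walk through it from top to bottom. The place where the query was created will show up in the call stack.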
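Where attaching a debugger is not practical, a variation on the same idea is to capture the call stack inside the constructor instead of stopping at a breakpoint. This swaps the breakpoint for plain JDK stack capture and is only a sketch: `EmployeeProbe` is a hypothetical stand-in for the entity, and such code should be removed again after debugging.

```java
final class EmployeeProbe {
    // Holds the call stack of the most recent construction, for inspection or logging.
    static volatile StackTraceElement[] lastConstructionStack;

    EmployeeProbe() {
        super();
        // Capture who triggered the instantiation; frame 0 is this constructor.
        lastConstructionStack = new Throwable().getStackTrace();
    }
}
```

Printing the captured frames shows the same chain of callers that the Debug view would display at the breakpoint.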