Philipp Hauer's Blog

Engineering Management, Java Ecosystem, Kotlin, Sociology of Software Development

Do-It-Yourself ORM as an Alternative to Hibernate

Posted on Oct 5, 2017. Updated on Jan 20, 2019

Hibernate is my daily business. And it bugs me. Hibernate adds non-trivial complexity to your application and restricts flexibility in terms of query capabilities and class design. Fortunately, there are many alternatives available. In this post, I’d like to recap some drawbacks of Hibernate and present an alternative: Do-it-yourself ORM with plain SQL, Spring’s JdbcTemplate and compact mapping code powered by Kotlin.

TL;DR

  • Complexity
    • The Hibernate stack introduces significant complexity into your application
    • Performance or caching issues are likely to occur and can be hard to track down.
    • Debugging can be really tough.
  • Reduced flexibility
    • Restricted expressiveness (limiting HQL abstraction, no database-specific SQL features usable)
    • Restricted class design (No immutable classes, class design based on relational schema)
  • Alternatives: Plain SQL + Spring’s JdbcTemplate + Do-It-Yourself object mapping. Java 8 and (even better) Kotlin provide powerful means to write compact mapping code. KISS.

Introduction and Disclaimer

This post reflects my personal experience from my daily working life. I’m maintaining a big legacy application which completely relies on Hibernate. It has a complex domain model with many references, scales horizontally and uses a distributed self-replicating second-level cache. For me, it’s a pain. It’s totally fine if you haven’t had the same experiences with Hibernate. In fact, I envy you. I just want to point out that there are other means for accessing a relational database in Java. I denounce the Hibernate reflex in our Java community.

Disadvantages of Hibernate

The Complex Stack

With Hibernate, you add non-trivial complexity to your application. This becomes critical when things don’t work as you expect. Debugging or tracing bugs or performance issues can be really time-consuming. Stepping through the huge, complex stack is no fun at all. Finding out which concrete SQL statements (with resolved placeholders) are actually executed is also not easy. Understanding and optimizing the nasty generated SQL queries can be extremely hard. Moreover, it’s really easy to run into performance issues (for example due to lazy/eager fetching problems, session handling or Hibernate’s overhead in general).

Hibernate provides a multi-layered caching mechanism, which is powerful, but very complex and can also lead to severe headaches (cache drifts, outdated states, can’t be mixed easily with native queries, distributed self-replicated second-level caches like Ehcache).

Hibernate may provide a solution for every problem mentioned, but you have to dive deep into Hibernate and understand how it works internally. This can be very time-consuming. And some problems wouldn’t even exist without Hibernate. The price for the convenience provided by Hibernate is high.

“I had to learn Hibernate architecture, configuration, logging, naming strategies, tuplizers, entity name resolvers, enhanced identifier generators, identifier generator optimization, union-subclasses, XDoclet markup, bidirectional associations with indexed collections, ternary associations, idbag, mixing implicit polymorphism with other inheritance mappings, replicating object between two different datastores, detached objects and automatic versioning, connection release modes, stateless session interface, taxonomy of collection persistence, cache levels, lazy or eager fetching and many, many more.” Grzegorz Gajos in How Hibernate Almost Ruined My Career

Restricted Expressiveness

Hibernate adds HQL as an additional abstraction layer over SQL. First, you have to learn another language. But the real issue is the following: HQL doesn’t provide all features of SQL (like a simple limit). Moreover, you can’t use database-specific SQL statements (like MySQL’s INSERT IGNORE or useful functions like UNIX_TIMESTAMP()), at least not without resorting to native queries, which have other drawbacks like cache flushes or poor maintainability. Besides, it can be tough to optimize a query, because you can’t directly influence the generated SQL and have to do it through the HQL abstraction (for instance in case of performance issues or when Hibernate generates a query that is not compatible with MySQL 5.7.5’s sql_mode ‘ONLY_FULL_GROUP_BY’).

So the HQL abstraction severely restricts your means. You can no longer use the most direct and best SQL statement to solve your problem (which may require SQL features that are not supported by HQL or that are vendor-specific). This easily leads to workarounds which are slow, hard to maintain or hard to understand.

Restricted Class Design

Hibernate leads to nasty domain design in your application layer. It imposes multiple constraints on your classes:

  • Entity classes have to be mutable because a default constructor is required and you can’t use final fields. You most probably also have to write setters (if you want to avoid Hibernate’s reflection magic). So it’s possible to create objects with an inconsistent state (and it’s only a matter of time until such issues occur in production). Besides, you can run into other issues related to mutability (subtle bugs due to side effects or concurrency; poor testability).
  • The relational schema is directly visible in your class design because you have to create a class for each table and rebuild the foreign keys with field references. Even worse, in case of composite keys, you often have to create an entity class for the intermediate tables. Your class design and composition are really restricted. For instance, you can’t simply join multiple tables together and map the result to a dedicated tailored class, which may perfectly fit the use case of your application layer.

By the way: If you model your domain starting from the object-oriented perspective (with many objects referencing each other), you most likely end up with multiple tables and many foreign keys, and suffer from poor performance.

All in all, the usage of Hibernate obstructs the natural and optimal class design. And Hibernate definitely doesn’t solve the impedance mismatch between the relational and the object-oriented world. Your class composition is just getting more relational.
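
To make the mutability constraint more tangible, here is a minimal sketch contrasting an entity-style class with an immutable one. Both class names and the email check are made up for illustration and are not taken from any real code base.

```kotlin
// Hypothetical example classes for illustration only.

// Entity-style class: the no-arg constructor and the setters allow
// half-initialized objects to exist.
class MutableAccount {
    var id: Int? = null
    var email: String? = null
}

// Immutable alternative: an instance is complete and validated at construction.
data class Account(val id: Int, val email: String) {
    init {
        require(email.contains("@")) { "invalid email: $email" }
    }
}

// "Changing" a value creates a new object; the original stays untouched.
// copy() goes through the primary constructor, so the check above runs again.
fun withEmail(account: Account, newEmail: String): Account =
    account.copy(email = newEmail)
```

With the second variant, an object in an inconsistent state simply cannot be constructed, which is exactly what a mapping layer that demands a default constructor and non-final fields rules out.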

Alternatives

  • Maybe relational databases are not a proper solution for your domain model at all. Other databases (like MongoDB) fit the object-oriented world much better. So there is no impedance mismatch that has to be tackled by an additional and complex layer like Hibernate. Thus, the mapping logic is much more straightforward and less error-prone. Spring Data MongoDB does a pretty good job here. For details, see my post Why Relational Databases are not the Cure-All.
  • Query DSLs like jOOQ and QueryDSL for Java or Exposed for Kotlin. These libraries let you write queries with a type-safe Java or Kotlin DSL. They are definitely worth checking out. I personally don’t like the additional complexity they introduce in the build process. Moreover, the DSL is again an additional restricting layer (like HQL). First, you have to learn it. Second, you may end up in situations where you want to use special SQL statements or database-specific features that are not (easily) provided by the DSL.
  • Execute plain SQL and do the object mapping manually. With the help of Java 8’s or Kotlin’s features, the mapping code is really compact. Moreover, Spring’s JdbcTemplate takes care of the handling of connections and transactions and provides a nice API to embed your mapping logic. Alternatively, you can use JDBI, which is slightly more high-level.

As I’m a fan of the third option, I’d like to give you a simple example of this approach.

Do-It-Yourself ORM with Spring JdbcTemplate and Kotlin

First of all, let’s define the immutable model class User with Kotlin’s powerful data classes.

import java.time.Instant

// immutable
data class User(
        val id: Int,
        val email: String,
        val name: String?,
        val role: Role,
        val dateCreated: Instant,
        val state: State
)
enum class Role { USER, GUEST }
enum class State { ACTIVATED, DEACTIVATED, DELETED }

Let’s write the UserDAO.

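import org.springframework.dao.EmptyResultDataAccessException
import org.springframework.jdbc.core.JdbcTemplate
import java.sql.ResultSet

class UserDAO(private val template: JdbcTemplate) {

    fun findAllUsers(): List<User> = template.query("SELECT * FROM users;", this::mapToUser)

    // queryForObject throws if no row matches, so that case is translated to null
    fun findUser(id: Int): User? = try {
        // the id is passed as a bind parameter instead of being interpolated into the SQL string
        template.queryForObject("SELECT * FROM users WHERE id = ?;", this::mapToUser, id)
    } catch (e: EmptyResultDataAccessException) {
        null
    }

    // has the shape of Spring's RowMapper, so it can be passed as a function reference
    private fun mapToUser(rs: ResultSet, rowNum: Int) = User(
            id = rs.getInt("id")
            , email = rs.getString("email")
            , name = mergeNames(rs) // execute custom mapping logic (helper not shown here)
            , role = if (rs.getBoolean("guest")) Role.GUEST else Role.USER // understandable and direct type conversion
            , dateCreated = rs.getTimestamp("date_created").toInstant()
            , state = State.valueOf(rs.getString("state"))
    )
}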

As you see, mapping objects manually is no magic at all. In fact, Kotlin’s features (try/catch is an expression, single expression functions, named arguments) make the mapping code compact and readable. The Java 8 equivalent won’t be that compact, but still good enough.

  • We have full control over the mapping process. This provides maximal flexibility. We can directly use every SQL feature (standard or database-specific) we like. Performance optimizations can be easily applied. Nothing restricts us. No need to learn another query language.
  • The database access layer is very thin and simple. Debugging won’t be a problem.
  • The type conversions are really straight-forward: Just code it (like the mapping from the boolean column guest to the enum Role).
  • The JdbcTemplate deals with the low-level details like transaction or connection handling.
  • Our class design is not restricted by the mapping layer or the underlying tables. So we can easily use immutable data! Or we can execute a join and write the result set to a single tailored class.
  • The code is easy to read and to understand. No reflection magic happens.
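
The point about joins can be sketched as follows. Note that the orders table, the column aliases and the class UserOrderSummary are assumptions made up for this example; only the users table appears in the DAO above. The mapper has the same (ResultSet, Int) shape as mapToUser, so it should be usable as a row mapper in the same way.

```kotlin
import java.sql.ResultSet

// Hypothetical read model tailored to one use case; it mirrors no single table.
data class UserOrderSummary(val userId: Int, val email: String, val orderCount: Int)

// Assumed schema: an "orders" table with a "user_id" foreign key to "users".
val userOrderSummarySql = """
    SELECT u.id AS user_id, u.email AS email, COUNT(o.id) AS order_count
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    GROUP BY u.id, u.email
""".trimIndent()

// Maps one row of the joined result directly to the tailored class.
// Intended usage (assumption): template.query(userOrderSummarySql, ::mapToUserOrderSummary)
fun mapToUserOrderSummary(rs: ResultSet, rowNum: Int) = UserOrderSummary(
    userId = rs.getInt("user_id"),
    email = rs.getString("email"),
    orderCount = rs.getInt("order_count")
)
```

There is no entity per table and no intermediate object graph here: the query returns exactly the shape the application layer needs.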

Objections against DIY-ORM

You may raise some arguments against that approach.

You have to write more code

Yes, but Kotlin and Java 8 make writing mapping code really straightforward. Still, it’s more writing. That’s the price for the increased flexibility and control. For me, it’s absolutely worth it. I’d rather spend an hour writing the required mapping code than a day hunting a bug or performance issue in the Hibernate stack.

Due to the SQL strings, it’s easier to introduce avoidable bugs (typos, mixed-up arguments, syntax errors)

That’s true, but you should write tests for the DAOs anyway.
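
As a rough sketch of what a cheap test of the mapping logic could look like, the following stubs a ResultSet with a JDK dynamic proxy and checks the guest-to-Role conversion from the DAO above. The helpers stubRow and roleMappingHolds are hypothetical, and mapToRole is extracted from mapToUser purely for illustration. This only covers the mapping code. Typos or syntax errors in the SQL strings still need a test that runs against a real or embedded database.

```kotlin
import java.lang.reflect.Proxy
import java.sql.ResultSet

enum class Role { USER, GUEST } // as defined in the post

// The conversion under test, extracted from mapToUser for illustration.
fun mapToRole(rs: ResultSet): Role =
    if (rs.getBoolean("guest")) Role.GUEST else Role.USER

// Hypothetical test helper: a ResultSet stub backed by a map of column label to value.
// Only getters that take a column label as first argument are supported.
fun stubRow(columns: Map<String, Any?>): ResultSet =
    Proxy.newProxyInstance(
        ResultSet::class.java.classLoader,
        arrayOf(ResultSet::class.java)
    ) { _, method, args ->
        val label = args?.firstOrNull() as? String
            ?: throw UnsupportedOperationException(method.name)
        columns[label]
    } as ResultSet

// Returns true if both boolean values map to the expected role.
fun roleMappingHolds(): Boolean =
    mapToRole(stubRow(mapOf("guest" to true))) == Role.GUEST &&
        mapToRole(stubRow(mapOf("guest" to false))) == Role.USER
```

In a real project, the check in roleMappingHolds would live in a unit test of whatever test framework you use.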

You shouldn’t use database-specific features

  • Why? Because you can’t exchange the database anymore? Come on. Moving from one relational database to another one is unlikely in practice. This never happened to me. So database-specific features are a small risk that I would always take. Besides, switching databases comes with much bigger challenges than just rewriting queries.
  • Moreover, using those features is an advantage for me - not a disadvantage. Vendor-specific SQL features can lead to way better solutions and avoid unmaintainable or slow workarounds.
  • Even if you stick to the SQL standard, the databases can still behave differently, which leads to issues that are hard to track down.

What about caches?

Hibernate’s caches are powerful. But they also add non-trivial complexity to your code. So measure first. Find out if you really have a performance issue that justifies the introduction of a caching layer. Mind that caching can be tricky. Just think about cache drifts, possible inconsistencies or a proper approach for cache invalidation. If you still need it, consider adding caches only to the hot spots of your application. Options include a Guava cache or Caffeine for local caches, or Redis and Hazelcast for cluster setups. A schema change can also improve the performance and remove the need for a cache.
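
To illustrate what caching only a hot spot can look like, here is a deliberately tiny read-through cache with time-based expiry. It is a hand-rolled stand-in built on the JDK only, not the API of Guava or Caffeine, and the class name ExpiringCache and its parameters are assumptions for this sketch. In production I would expect one of the libraries mentioned above to be the better choice.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical minimal local cache: loads a value on a miss and reuses it
// until it is older than ttlMillis.
class ExpiringCache<K : Any, V : Any>(
    private val ttlMillis: Long,
    private val clock: () -> Long = System::currentTimeMillis,
    private val loader: (K) -> V?
) {
    private data class Entry<T>(val value: T, val loadedAt: Long)

    private val entries = ConcurrentHashMap<K, Entry<V>>()

    fun get(key: K): V? {
        val now = clock()
        val cached = entries[key]
        if (cached != null && now - cached.loadedAt < ttlMillis) {
            return cached.value
        }
        // Miss or expired entry: ask the loader (for example a DAO call).
        // Concurrent callers may load the same key twice; acceptable for a sketch.
        val loaded = loader(key)
        if (loaded == null) {
            entries.remove(key)
        } else {
            entries[key] = Entry(loaded, now)
        }
        return loaded
    }

    // Call after a write so that readers do not see an outdated state.
    fun invalidate(key: K) {
        entries.remove(key)
    }
}

// Possible usage with the DAO from above (assumption):
// val userCache = ExpiringCache<Int, User>(ttlMillis = 60_000) { id -> userDao.findUser(id) }
```

Even this small example shows where the complexity comes from: you have to pick an expiry time and remember to invalidate on every write path.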

Further Reading