Philipp Hauer's Blog

Engineering Management, Java Ecosystem, Kotlin, Sociology of Software Development

Don't use In-Memory Databases (H2, Fongo) for Tests

Posted on Aug 21, 2017. Updated on Jun 12, 2022

At a first glance, in-memory databases (like H2 or Fongo) look like a good idea. You can test your code without having to worry about installing and managing a dedicated database server up front. Just start your tests and the H2 database will be up and running. However, this comfort comes with severe drawbacks. In this post, I explain my reservations and point out Docker as an alternative which can be easily used with TestContainers or within the Gradle/Maven build.

Don't use In-Memory Databases (H2, Fongo) for Tests


  • Using in-memory databases for tests reduce the reliability and scope of your tests. Your application’s SQL may fail in production against the real database, although the h2-based tests are green.
  • They provide not the same features as the real database. Possible consequences are:
    • You change the application’s SQL code just to make it run in both the real and the in-memory database. This may result in less effective, elegant, accurate or maintainable implementations. Or you can’t do certain things at all.
    • You skip the tests for some features completely.
  • Solution: Throw H2 and Fongo away and use a dockerized version of your real database instead. Docker simplifies the management of database instances.
  • Two options:
    • Use TestContainers and manage the database in your test code. I recommend this approach.
    • Integrate the container management into the build. Just start the database container, run the tests and stop it afterward. The example implementations for Gradle and Maven can be found below.
With H2, we are testing against a different database than we use in production. This can have a negative impact on the test reliability and the application implementation.

With H2, we are testing against a different database than we use in production. This can have a negative impact on the test reliability and the application implementation.

Drawbacks of In-Memory Databases for Tests

With in-memory databases, you are testing against a different database than your production database. This significantly reduces the meaning and reliability of your tests. A green h2-based test doesn’t mean that your application will also work against the real-world database. Let me give a very simple example from my experience:

  • Since MySQL 5.7.6, virtual is a keyword and can’t be used as an unquoted column name. By chance, we had a Hibernate entity with a property virtual. So after migrating to MySQL 5.7.6+, the request touching this entity will fail. Unfortunately, we execute our tests against an H2 database, so this mistake was not revealed!

But that’s why we are actually doing tests, right? To find those bugs. And we can only find them if we are as close to the (database) reality as possible.

Another even bigger issue is the different feature set of in-memory databases compared to the real database. From time to time, you need to use vendor-specific features. For me, this is fine, because those features often allow you to be more efficient, accurate, elegant or maintainable - or to do certain things at all. But they are usually not supported by in-memory databases:

  • For instance, we wanted to use MySQL’s unix_timestamp() function to be independent of the different time zones where our MySQL database servers are running. We want to prevent subtle inaccuracies. But this function doesn’t exist in H2, so we couldn’t use them.
  • Another example are triggers in Oracle database, which also doesn’t exist in H2 natively. There are only Java-based triggers. So we had to rewrite the trigger logic in Java.
  • Postgres has powerful features like window functions and JSON data types which are not supported by H2. Further Reading.
  • It’s the same story for Fongo, an in-memory-database for MongoDB. We already had several cases where the fongo returns different results than a real mongo (e.g. when it comes to sorting or filtering with complex elements in arrays). Especially the support for the aggregation framework is poor and therefore unusable.

What are our unpleasant options in cases of unsupported features?

  • If we are lucky, we can register java-based implementations for some of the missing functionality. For instance, in H2, we can register the missing MySQL unix_timestamp implementation or rewrite the Oracle trigger logic in Java. Drawbacks: First, we have additional implementation effort. Second, we don’t test the real database-side code anymore. The meaning of tests gets partly lost. Awful.
  • Don’t use the vendor-specific feature at all and change the production code to become H2-compatible. Usually, this leads to non-optimal solutions. Awful.
  • Don’t test the code using the vendor-specific feature. Awful.

Integration Tests in Production-like Environments

Especially the first kind of drawback (find bugs that only occur when executing against the real database) can be tackled by proper integration tests in a production-like environment/VM. That’s true and those end-to-end tests are really important. So you may need and have that kind of tests anyway. But we have to aware of the following points:

  • If you are lucky, the integration tests will show those database-specific errors. It depends on the coverage of the integration tests.
  • You only have a single layer of security, because your first test layer, the h2-based tests of your build, won’t find them.
  • Moreover, I personally prefer that those errors show up as early as possible in the build and delivery pipeline. I like to see those errors in the test phase of the project build.
  • Setting up a proper production-like environment and a delivery pipeline is required. This can be non-trivial.
  • Integration tests are only black-box tests and coarse-grained. They are slower and require more bootstrapping.
  • The more external services and resources are used by your application, the more fragile your integration tests can become.

Alternative: A Docker Container For The Database

Docker makes creating a real-world database so easy. Due to the standard management layer that Docker provides, we can easily utilize any database we want for our tests. But mind that you have to install Docker on both the developer’s machine and all CI servers. But for me, the increased safety and uncompromising implementation are definitely worth the effort.

Basically, there are two approaches:

  • Manage the database container in the test code (using TestContainers)
  • Manage the database container by the build (using Maven or Gradle)

Managed by the Test

By using TestContainers, we start the database container right in our Java test class. My playground project db-container-managed-by-the-test shows an example. The relevant part is the following:

public class MyTest {

    private static DataSource dataSource;
    private static MySQLContainer mysql;

    public static void init() throws InterruptedException{
        //You can also use the GenericContainer for arbitrary containers
        //But there are convenient classes for common databases.
        mysql = new MySQLContainer("mysql:5.5.53");
        dataSource = DataSourceBuilder.create()

    public static void destroy(){

    public void foo(){
        //use the database

Tip: You can reuse a single instance of a database container across multiple test classes by using an abstract test class with a static field for the container. But this depends on the implementation of the container class. See here for more details.


  • Really easy to set up.
    • No build integration required (no passing around of host, port, user and password).
    • TestContainers takes care of the logic to wait for the container startup before running the tests. This depends on the used container class, but you can implement your own wait logic.
    • TestContainers automatically searches for an open port.
  • No manual actions are required to execute the test in your IDE.


  • Slow feedback cycles during development. The container is started everytime you execute a test in your IDE. This takes some time (10 - 15 s), which can be annoying when you have to wait this time after every small change in the test code.
    • Fortunately, there is a solution for this: Just introduce a system property (like -DreuseDatabase). If set in the IDE’s run configuration (check this via System.getProperty()), the dataSource points to a database running on the developer’s machine. No container will be started. This database is manually started up front (e.g. via docker-compose) and reused for every test execution. This ensures a fast feedback cycle during the initial development.
  • It’s a little bit harder to debug a test because the container is shut down after the test has run.
    • Again, the above tip helps here.

TestContainers convinced me in practice, so I recommend this approach.

Managed by the Build

We can integrate the container management into our build. Both Maven and Gradle have nice plugins for managing Docker containers. Just configure your build to start the database container, run the tests and stop the container afterward.


  • Fast feedback cycles during development. The test execution time is not delayed by the startup time of the container. So you can change the test code, execute it in your IDE and get the feedback really fast. But this approach assumes that you have a running instance of the containers during development. That’s always true in my case.
  • It’s really easy to debug the test code and investigate the database after the test execution because the database is running all the time.


  • Yes, it’s slightly harder to set up those builds. You have to integrate the container management into your build, pass the required information (host and port) to the tests (e.g. as system properties) and wait until the container is ready before starting your tests. Moreover, you have to make sure a certain port is always free (which can be tricky on your CI servers and with parallel job execution). Alternatively, you can search for a free port in your build script.
  • You can’t execute the test directly in your IDE anymore. You have to start up the containers manually up front (e.g. via docker-compose). But during development, those containers are running anyway the whole time on my machine because I need them for the actual application. So the local test execution can reuse this container. And calling docker-compose up once a day manually is not a big deal.

Gradle Implementation

I created a simple project db-container-managed-by-gradle on GitHub. It shows how you can control a Docker container using the gradle-dcompose-plugin. The following parts of the build.gradle are required:

plugins {
    id "com.chrisgahlert.gradle-dcompose-plugin" version "0.9.1"

def mysqlTestPort = 3306
def mysqlTestPw = 'root'

dcompose {
    database {
        image = 'mysql:5.5.53'
        portBindings = ["$mysqlTestPort:3306"]
        env = ["MYSQL_ROOT_PASSWORD=$mysqlTestPw"]

test {
    dependsOn startDatabaseContainer
    finalizedBy stopDatabaseContainer
    doFirst {
        systemProperty 'mysql.port', mysqlTestPort
        systemProperty '', mysqlTestPw

It’s important to pass the host and port of the MySQL container as system properties to the Java tests. We can access the properties via System.getProperty("mysql.port", "3306"). A simple example test can be found in the sources.


# This will start the MySQL container, run the test and stop the container again.
./gradlew test

Maven Implementation

In the Maven world, there is the nice io.fabric8:docker-maven-plugin for controlling docker containers. I also created the small project db-container-managed-by-maven as a showcase. The important parts of the pom.xml are:




mvn verify