Direct naar content

MongoDB or PostgreSQL for a web application

PostgreSQL, like all SQL databases (including SQLite), is a relational database. PostgreSQL specializes in keeping track of relationships between pieces of data and helps you look up data when you know something related to it.

For example, suppose you have two database tables, “Users” for all users and “Posts” for all posts done by a user, with “Posts.user_id” as a foreign key reference to “User. id”. You have a schema where 1 Post can have only 1 User, and logically, 1 User can have multiple Posts (because Users itself has no relationship with Posts, the relationship is the other way around). SQL makes it very easy to work with such relationships. You can say “give me all the Posts of a certain User”, or “give me the User of this Post”.

Martijn Wallet

Principal DBA Consultant
Ellert van Koperen

MongoDB or PostgreSQL?

All SQL databases can handle relational queries as described above, but PostgreSQL is very good at it. However, there is also one aspect that PostgreSQL is not very good at handling: graphs. We’ll come back to that later.

The Users and Posts relationship above is an example of a “one-to-many” relationship (or “many-to-one” depending on which way you look at it). There is also something called a “many-to-many” relationship, which is based on a combination of a “many-to-one + one-to-many” relationship.

In our hypothetical example of Users and Posts, we could argue that a Post has more than 1 User by defining the following: “Users Posts” where this new “Users_Posts” table has two columns, 1 with a foreign key relationship to a record in Users, and another with a foreign key relationship to a record in Posts. For each unique combination of User and Post, a record can be added to the Users_Posts table connecting the user and the post.

Here is another common situation where you need such a model: Plotting Films with their Actors. You then have a table for Films and Actors, as well as a table connecting them.

Let’s say you want to know in which movies actors A and B play together. That’s easy:

SELECT * FROM Films

WHERE id IN (SELECT film_id FROM Films_Acteurs WHERE acteur_id IN (B))

AND id IN (SELECT film_id FROM Films_Acteurs WHERE acteur_id IN (B));

Nested queries

Did you see what we did? These are called nested queries. Starting at the inside, we asked for all the movie_ids in the join table that contain A and that contain B, then we asked for all the details from the Movies table for the movie_ids found in the previous one.

But what if what if A and B have never been in a movie together but you want to know if they have matching co-actors? This is how LinkedIn and Facebook work, where you search for a profile of someone who is not a friend, but LinkedIn and Facebook look for common friends.

This is a graph search pattern and something for a totally different type of database called graph database. For example, Neo4J is such a type of database. Although graph data can be modeled with a relational database such as PostgreSQL, you are better off using a database type designed for that purpose such as Neo4J.

MongoDB

And then there’s MongoDB. So what is it for? MongoDB is not a graph database, or relational database. like CouchDB, it is a document database and represents the opposite. What would you do if you want to store all the details of a movie in 1 place and have no intention at all of comparing the data with each other.A relational database wants you to “normalize” your data by storing every little detail in a separate table, but you might find that irritating.

You just want to store data in 1 place without having to bother yourself too much. You’re still trying to figure out what data you actually want to store, so for now you just send it somewhere and you can worry about querying/querying it later. That’s exactly what document databases are for, and MongoDB is good at that.

Nowadays PostgreSQL has many stable extensions available such as store, arrays and JSON datatype for columns, so one can also implement document-oriented features in PostgreSQL. PostgreSQL is stable and fast. The choice also depends on programming language used. Some libraries support Postgres, but not Mongo. So when would MongoDB really be the better choice?

Most answers to this question overshoot the most important aspect. PostgreSQL has very good native key/value and JSON support, so if you want to go “schemaless” and apply data modeling only when you already have a “heap of data,” you could just use an HStore/JSON column and store data in it.

You can additionally use schemas on other tables and avoid headaches about using multiple different database platforms. PostgreSQL introduced the JSON data type in version 9.2 and the more efficient JSONB type in 9.4. So you can use a mix of relational and document data in PostgreSQL itself.

And then another thing … High availability is not the same as a write speed, in fact they do not go together. Super fast “write throughputs” can be achieved with MongoDB by deferring writing to disk until a later time. That allows for data loss, the opposite of availability. PostgreSQL does not have these features and may not be of interest if you do not value durability of your data.

As of version 9.4 PostgreSQL benchmarks show it to be faster than MongoDB for both writing and reading JSON data. For this, see also Postgres Outperforms MongoDB and Ushers in New Developer Reality (although there is no doubt that both databases can be further tuned for specific scenarios).

Here is also information about a choice in this: