PostgreSQL and MySQL are two of the most popular open-source databases available today. They both provide the database backend for an untold number of web applications, enterprise software packages, and data science projects. The two databases share some similarities in that they both adhere to the SQL standard.
However, there are some key differences that might influence your decision to choose one over the other. PostgreSQL is known for its advanced features and for being impressively durable and scalable. MySQL is well-known for its ease of use and speed in read/write operations.
Here’s an overview of their similarities and differences, including their architectures, data types, indexing schemes, security, and performance.
PostgreSQL and MySQL: the similarities
Both PostgreSQL (also known as “Postgres”) and MySQL are Relational Database Management Systems (RDBMS). That means both store data in rows and tables, have a mechanism to define the relationships between the data in the tables, and provide the Structured Query Language (SQL) to access the data via standardized queries.
Both database systems are ACID-compliant. ACID (atomicity, consistency, isolation, durability) compliance ensures data consistency and integrity, even in the face of system errors, hardware failures, and power outages. Both support replication for adding more servers to host data with fault tolerance and a distributed workload.
MySQL and PostgreSQL are both free and open source, meaning that anyone can obtain the source code, install the software, and modify it how they see fit. Both offer tight integration with web servers like Apache and programming languages like PHP and Python.
Architectural differences and data types
While both MySQL and PostgreSQL are examples of an RDBMS, PostgreSQL also qualifies as an Object-Relational Database Management System, or ORDBMS. This means that Postgres has the typical characteristics of a relational database, and it’s also capable of storing data as objects.
At a high level, objects in software development are models with attributes and properties that can be accessed with forms of code known as procedures and methods.
To see the difference, have a look at the supported data types in both systems. MySQL supports a set of standard data types, including VARCHAR (text fields limited to a certain length), TEXT (free-form text), INTEGER (an integer number), BOOLEAN (a true/false field), and DATE (a timestamp). Meanwhile, PostgreSQL supports the standard data types along with a wide range of more complex data types not seen in a traditional RDBMS. This includes MONEY (a currency amount), INET (IP addresses), MACADDR (a network device’s MAC address), and many other specialized objects.
Perhaps most importantly, Postgres supports the JSON and JSONB data types, which are JSON text and binary JSON data. As a majority of REST web service APIs today transfer data in JSON format, this makes PostgreSQL a favorite of app developers and system administrators. While MySQL can be made to store JSON text, the ability to natively query stored JSON data is a major advantage of PostgreSQL.
MySQL and PostgreSQL query languages
PostgreSQL supports the creation of custom data models with its PL/pgSQL query language, which is substantially more full-featured than MySQL’s standard SQL implementation.
In essence, PL/pgSQL can be seen as both a query language and a procedural programming language. PL/pgSQL supports programming constructs like loops, conditional statements, variables, and error handling. The language also makes it easy to implement user-defined functions and stored procedures in queries and scripts.
MySQL’s SQL implementation lacks these features and is best suited for simple queries, along with data sorting and exporting.
Even though PL/pgSQL is unique to PostgreSQL, it actually has a stricter adherence to SQL standards than MySQL’s SQL implementation. Advanced SQL features like window functions and common table expressions (CTEs) are available in PostgreSQL but not MySQL.
Scalability differences between MySQL and PostgreSQL
Both PostgreSQL and MySQL are capable of scaling to handle large amounts of data and high levels of traffic and to support complex applications. However, scaling MySQL typically involves adding more hardware and database instances, while PostgreSQL has some advanced features that naturally support scaling.
PostgreSQL uses a system called MVCC (Multiversion Concurrency Control) that allows multiple users to access and modify data simultaneously without locking out or slowing down each other’s queries like MySQL. This is particularly helpful for applications requiring high read/write activity levels.
When adding additional servers, MySQL uses binary log-based replications, which is fast but can lead to data inconsistencies when network hiccups interrupt replication activities. PostgreSQL uses the “log-shipping” approach, which is more reliable but can be slower than binary log replication. But PostgreSQL also supports table partitioning, which allows a single table to be spread across multiple smaller tables. This tends to improve performance because smaller amounts of data are queried at one time.
PostgreSQL also has a more advanced query optimizer than MySQL, which helps queries to execute more efficiently. PostgreSQL also sports a larger maximum table size than MySQL, making it better suited for applications with large datasets.
PostgreSQL and MySQL take different approaches to security. Both have mechanisms for granting access to schemas and tables to defined users, but PostgreSQL offers more advanced features.
PostgreSQL has a fine-grained approach to user privileges, allowing administrators to assign more specific privileges to users and roles. MySQL, on the other hand, uses a more broad and more basic authorization system with a combination of user accounts and global or database-specific privileges. PostgreSQL supports many authentication methods beyond the simple username and password combination. This includes authenticating against an LDAP server or Active Directory and certificate-based authentication.
Both systems support encryption, with PostgreSQL offering more options. In particular, PostgreSQL supports column-level encryption and a feature known as Transparent Data Encryption (TDE). With TDE, all data in a schema is encrypted using a symmetric encryption key. This key, in turn, is protected by a master key that can be stored in a software key management system or a hardware-based security module.
MySQL uses SSL (Secure Sockets Layer) to help ensure data integrity, which makes it a popular database for web applications. Beyond that, MySQL doesn’t offer as many security and encryption features as PostgreSQL. But that doesn’t mean it’s insecure. Through the judicious use of strong passwords and network-level security, a MySQL installation can be secured well enough to meet enterprise standards.
An RDBMS’s transaction methodology ensures data consistency and integrity while playing a large part in the database’s overall performance. The speed at which transactions are performed defines whether a database system suits a particular task.
Since both PostgreSQL and MySQL are ACID-compliant, both support transaction rollbacks and commits. However, MySQL does not enable transactions by default, opting for “auto-commit” mode out of the box. This means each SQL statement is automatically committed or rolled back unless this setting is changed.
MySQL uses a locking mechanism that is optimized for performance but can lead to inconsistencies in some cases. PostgreSQL uses a strict locking mechanism for a higher level of consistency.
MySQL first gained popularity in Web 1.0 days, in part because it’s open source and works well with other free and open-source software such as the PHP language and operating systems built on the Linux kernel. A strong community has built up around MySQL over time, making it one of the most popular open-source packages of all time.
The well-known acronym LAMP—standing for Linux, Apache, MySQL, and PHP (or Perl, or Python)—came from this community, in honor of the free software packages that have powered many dynamic websites for decades.
MySQL was created by Swedish developers Michael Widenius and David Axmark in 1995. A year later, the two founded the company MySQL AB to provide commercial support and consulting service for the database as it grew in popularity. In 2008, Sun Microsystems acquired MySQL AB for $1 billion. Two years later, Sun was acquired by Oracle Corporation, which means MySQL is owned by the tech giant.
This raised concerns in the open-source community that Oracle would prioritize its own proprietary RDBMS solutions over MySQL. These fears have mostly been unfounded, as Oracle continues to develop MySQL and offer it under the GNU General Public License (GPL), making it free for personal and non-commercial use. However, the GPL allows Oracle to charge for commercial uses of MySQL, which makes some in the community no longer consider MySQL to truly be “free and open source.”
In response to these concerns, a community-supported version of MySQL has emerged called MariaDB. While identical to MySQL in basic form and function, MariaDB lacks some of MySQL’s advanced features.
PostgreSQL is released under a modified version of the MIT license known as the PostgreSQL License. This is a permissive free and open-source license, allowing users a great deal of flexibility in how they can use and modify the software.
As a result, PostgreSQL remains one of the most popular open-source databases in the world, with a large community support base of many users, enterprise admins, and application developers. However, there tend to be more community contributions to the MySQL and MariaDB ecosystems.
MySQL is utilized by an untold number of websites thanks in part to the database being free and open source, as well as its out-of-the-box support for the PHP language. The combination of PHP and MySQL helped create a rush of dynamic websites that didn’t have their HTML code manually updated.
Early on, Google used MySQL for its search engine. Over time, as the search giant’s dataset grew, it moved to different database technologies optimized for unstructured data and fuzzy searches. (Today, Google search is powered by Google’s own distributed data storage system, Bigtable.)
MySQL is still widely used for many small- to medium-sized web applications. Content management systems and specialized web apps like Geographic Information Systems (GIS) almost always support MySQL as a database backend.
Many enterprises also use it as the data backend for their internal applications and data warehouses. PostgreSQL is used in many of the same scenarios. Most web apps that support MySQL will also support PostgreSQL, making the choice a matter of preference for sysadmins and database administrators.
PostgreSQL: pros and cons
Here are some of the pros of choosing PostgreSQL:
- Performance and scalability that matches commercial RDBMS products.
- Concurrency support for multiple write operations and reads at the same time.
- Support for high availability of services and a reputation for durability.
Some of the cons of PostgreSQL include:
- Can be complex to set up and manage, particularly for newcomers.
- Reliability comes at a cost to performance.
- Large databases used in complex applications can be memory intensive.
- Less community support than MySQL/MariaDB.
MySQL: pros and cons
The pros of MySQL include:
- MySQL’s storage engines enable fast performance.
- A small footprint and an easy-to-use replication system make it easy to grow and scale.
- Strong open-source community support.
- Nearly all web applications and enterprise systems support MySQL.
And here are some cons of choosing MySQL:
- Not as scalable as PostgreSQL or newer database systems.
- Lack of advanced features like full-text search and complex data types.
- Less resilience when processing complex queries.
- No built-in support for backups, requiring the use of third-party backup software.
PostgreSQL and MySQL: which to choose?
Both PostgreSQL and MySQL are extremely capable RDBMS packages. While PostgreSQL clearly supports more advanced features and has a greater reputation for reliability, that doesn’t mean MySQL is a bad choice.
For smaller and medium-sized web applications, MySQL’s relative simplicity makes it a great choice. Those new to SQL and RDBMS applications in general can pick up the basics of MySQL quickly, making it a great choice for enterprises with limited IT resources. MySQL also has a strong community behind it, with decades of apps supporting MySQL.
If you will be dealing with a larger dataset or developing complex custom applications, PostgreSQL is an excellent choice. Its support for custom data types and the PL/pgSQL language make Postgres a favorite of sysadmins, web developers, and database administrators around the world.