5👍
I think you misunderstand the benefits and cost of synchronous commits on replication. In PostgreSQL, replication works by recovering slaves up to the master, using the standard crash recovery features of PostgreSQL. In the event that, for example, the power goes out, you can be sure that the write-ahead log segments will be run on both master and slave. With asynchronous commit, the commit is written to the WAL, the application is notified and the slave is notified more or less all at the same time depending on network characteristics, etc.
Where synchronous commit comes in handy is where two things are true:
- You have more than one slave (this is critical!) and
- You need durability guarantees that asynchronous commits can’t offer you.
With synchronous commit, the database waits until it hears back from a configurable number of slaves to tell the application that the commit has happened. This offers durability guarantees in a few cases where asynchronous commits are unable to work.
For example, suppose your master server takes a bullet through a raid array and immediately crashes (sorry, I couldn’t think of any better examples with good hardware). Or suppose someone trips on a power cord and not only powers off the server but corrupts the software RAID device. In this case it is possible that a couple of transactions may not have been replicated and your WAL is unrecoverable, so those transactions are lost. With synchronized commit, the application would have waited until durability guarantees were met.
One thing this means is that if you do synchronous commit with only one slave your availability cannot outlast a crash on either master or slave, so your availability will be worse than it would have been with just one server. It also means that if your slave is geographically removed, that you introduce significant latency in your application’s request to commit transactions.
Switching from async to sync commit it not a big change, but generally, I think that you get the most out of sync commit when you have already done as much as you can assurance and availability-wise on your hardware already. Start with async and move up when you can’t further optimize your setup as async.
4👍
Re: “Slony and PGPool II seem to be pretty popular (Slony, in particular) for replication. Is there A) a particular, technical reason for their popularity over built-in replication or B) is it just because a lot of people are using versions that don’t have built-in replication, or C) have I been under a rock and PG built-in replication is already quite popular?”
Slony is popular because it has been around for quite a long time, and the built-in PostgreSQL replication is relatively new. Cascading replication built in to PostgreSQL is even newer, and is something that Slony-I was built with.
There are two main advantages to Slony-I, first, you can replicate between differing versions of PostgreSQL, whereas the built-in replication system not only must use the same version, but the two servers must also be binary compatible. The other advantage is that you can replicate only certain tables on Slony-I instead of the whole database cluster. The disadvantages of Slony-I are numerous, and include poor user-friendliness, no synchronous commits, and difficult DDL (schema) changes. I believe that use of the built-in replication in Postgres will quickly exceed the Slony-I user base if it hasn’t already done so.
As far as I remember, PGPool II uses statement-based replication (like what MySQL has had built-in), and is definitely the least desirable, in my opinion.
I would use the built-in hot standby/streaming replication in PostgreSQL. You can start with synchronous commit turned on and turn it off if you don’t need it or the penalty is too high, or vice versa. Over a LAN, asynchronous mode seems to reach the slave in the order of a hundred milliseconds or so (from my own experience).
- [Django]-A pip install doesn't add anything to pipFile or pipFile.lock
- [Django]-Django static templatetag not displaying SVG
- [Django]-Can't add object to Django's Many-To-Many Field
- [Django]-My custom heroku python buildpack downloads requirements on every push
- [Django]-Postgres connection details in django and chef