Why is the archive/ directory on my CentOS 7 PostgreSQL 10.4 installation several hundred GibiBytes?
First of all, “GibiBytes,” you may ask. Yes. It’s what CentOS, and perhaps Unix in general, uses for file sizes. A GiB is just 2³⁰ bytes, whereas a GB, or gigabyte, is 10⁹ bytes. Now that we have that behind us:
Are you noticing the archive/ directory size getting out of hand?
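On my install, archive_mode was set to on out of the box, and you may find the same. (Upstream PostgreSQL defaults it to off, so this depends on how the server was packaged or set up.)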
### Here is what the Archiving section of the postgresql.conf file looks like by default:
# - Archiving -
archive_mode = on     # enables archiving; off, on, or always
                      # (change requires restart)
archive_command = 'cp %p /var/lib/pgsql/10/archive/%f'    # command to use to archive a logfile segment
If your install is in this commonly used location, then check
Not to be patronizing, but if you find it helpful, run this find command to locate your postgresql.conf file:
# find / -name postgresql.conf
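Once you know where the file lives, a quick grep shows which archive settings are actually active. This is only a minimal sketch: the function name show_archive_settings is made up for illustration, and it only looks at uncommented archive_mode and archive_command lines.

```shell
# Hypothetical helper (not part of PostgreSQL): print the active,
# uncommented archive_mode and archive_command lines of a config file.
show_archive_settings() {
    grep -E '^[[:space:]]*archive_(mode|command)[[:space:]]*=' "$1"
}

# Example call; pass whatever path the find command reported:
# show_archive_settings /path/to/postgresql.conf
```

Commented-out lines are skipped on purpose, since PostgreSQL ignores them too.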
Then you can vim (or preferred editor) the conf file and edit it so it looks like this:
# - Archiving -
archive_mode = off    # enables archiving; off, on, or always
                      # (change requires restart)
### archive_command = 'cp %p /var/lib/pgsql/10/archive/%f'    # command to use to archive a logfile segment
Anyway, that’s what I did. To turn off the default Postgres archiving, just set
archive_mode = off in the postgresql.conf file.
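If you would rather not edit by hand, the same change can be scripted. The sketch below is one possible way, not the method I actually used: disable_archive_mode is a hypothetical name, and it assumes the value is written unquoted (on, off, or always), as in the stock file.

```shell
# Hypothetical helper: set archive_mode to off in the given config file.
# Assumes an unquoted value such as "on" or "always".
# A copy of the original is kept next to it as <file>.bak.
disable_archive_mode() {
    conf_file="$1"
    cp -p "$conf_file" "$conf_file.bak" || return 1
    # Rewrite from the backup into the original path, so the file keeps
    # its owner and permissions. Only the value changes; any trailing
    # comment on the line is left in place.
    sed 's/^\([[:space:]]*archive_mode[[:space:]]*=[[:space:]]*\)[A-Za-z]*/\1off/' \
        "$conf_file.bak" > "$conf_file"
}

# Example call; the path is whatever your find command reported:
# disable_archive_mode /path/to/postgresql.conf
```

If anything looks wrong afterwards, copying the .bak file back restores the original.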
You’ll also have to restart postgresql for the change to take effect. I used:
# service postgresql-10 restart
-- then go ahead and check the status
# systemctl status postgresql-10.service
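I wanted to check that the archiver process was no longer running:
# ps aux | grep archive
(The grep command itself will show up in that output; the line that should be gone is the "postgres: archiver process" one.)
I’m using streaming replication from server1 to server2, so I also wanted to make sure replication was still working: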
[root@aserver1 ~]# su - postgres
Last login: Fri Aug 31 23:47:52 EDT 2018 on pts/1
-bash-4.2$ psql -c "select application_name, state, sync_priority, sync_state from pg_stat_replication;"
 application_name |   state   | sync_priority | sync_state
------------------+-----------+---------------+------------
 pgslave01        | streaming |             1 | sync
Then to check further, I began inserting records into the primary server1 and checking that the replica was getting the inserts on server2.
Next I wanted to pack the archive/ directory into a gzip-compressed tarball (tar’s -z option runs the archive through gzip), with:
# tar -zcvf archive.tar.gz archive/
This will read each 16M archive file into a single tar stream and gzip that stream as a whole, producing one archive.tar.gz file.
But, if you’re reaching the upper limits of the space available on your partition, you might not have the space to create this extra tar file.
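One way to check ahead of time is to compare the size of the directory with the free space on the partition where the tarball would land. This is a rough sketch with a made-up function name, has_room_for_copy, and it deliberately assumes the worst case where compression saves nothing.

```shell
# Hypothetical helper: succeed only if the filesystem holding dest_dir has
# more free space than src_dir currently occupies (worst case: the tarball
# ends up as large as the source).
has_room_for_copy() {
    src_dir="$1"
    dest_dir="$2"
    needed_kb=$(du -sk "$src_dir" | awk '{print $1}')
    free_kb=$(df -Pk "$dest_dir" | awk 'NR==2 {print $4}')
    [ "$free_kb" -gt "$needed_kb" ]
}

# Example call:
# has_room_for_copy /var/lib/pgsql/10/archive /var/lib/pgsql/10 && echo "enough room"
```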
The approach I used was to cd into the archive/ directory (/var/lib/pgsql/10/archive) and run this command:
# gzip -v --rsyncable *
The cool thing about this approach is that the command iterates through each 16M archived WAL file, replacing each original with a gzip’d file, so it only needs spare room for one compressed file at a time. The -v option echoes the progress line by line, and the --rsyncable option makes the output play nicely with rsync. After the process completes, you can rsync the whole archive directory elsewhere.
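If the run gets interrupted, or you want to repeat it later, it helps to skip the files that are already compressed. Here is a minimal sketch of that idea. The name compress_wal_archive is hypothetical, and I left --rsyncable out so the sketch works with any gzip build; add it back if yours supports it.

```shell
# Hypothetical helper: gzip every regular file directly inside the given
# directory, skipping files that already end in .gz.
compress_wal_archive() {
    find "$1" -maxdepth 1 -type f ! -name '*.gz' -exec gzip -v {} +
}

# Example call:
# compress_wal_archive /var/lib/pgsql/10/archive
```

As with the plain gzip command, each original is replaced by its .gz version one at a time.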
As a side note, when you’re digging around looking for why your disk is running out of space, these commands might come in handy, but if you’ve read this far, you’re probably already familiar with them:
# df -h
which will show you the big picture with percentages and GibiBytes
# du -sch * | sort -h
which is great for listing the size of the contents of directories and ordering those from least to greatest, so the biggest space-taking directories show up conveniently at the bottom.
I sincerely hope this offering is coherent and helpful.