tar
is an archiving utility; it creates and extracts (compressed) archives, aka tarballs.
tar -czpf foo.tar.gz sourceFiles file1 file2 # creates a compressed archive
tar -xpf foo.tar.gz                          # extracts the archive
tar -xpf foo.tar.gz -C dest/                 # extracts the archive into the dest/ directory
c or --create
    creates the archive
x or --extract
    extracts the archive
z or --gzip/--gunzip
    compresses or uncompresses the archive with gzip
p or --preserve-permissions
    preserves file and directory permissions
f
    provide the file name (foo.tar.gz in the above example)
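The short flags above are just abbreviations of the long options. Spelled out, the first example reads as follows (file names are throwaway examples created on the spot):

```shell
# set up two throwaway files to archive
touch file1 file2

# long-option equivalent of `tar -czpf foo.tar.gz file1 file2`
tar --create --gzip --preserve-permissions --file foo.tar.gz file1 file2

# confirm the archive contains both files
tar --list --file foo.tar.gz
```

The long forms are easier to read in scripts that others will maintain.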
Compress your backups for faster transfers, less bandwidth usage, and less disk space usage (you will get charged for the disk space and bandwidth if you're transferring backups off-site, to a service like Amazon S3). Since backups are usually automated, you can skip -v, the verbosity flag.
You can optionally preserve and restore file ownerships as well, with the --same-owner and --preserve flags. From tar --help:
     --no-same-permissions apply the user's umask when extracting permissions
                           from the archive (default for ordinary users)
     --numeric-owner       always use numbers for user/group names
     --owner=NAME          force NAME as owner for added files
 -p, --preserve-permissions, --same-permissions
                           extract information about file permissions
                           (default for superuser)
     --preserve            same as both -p and -s
     --same-owner          try extracting files with the same ownership as
                           exists in the archive (default for superuser)
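As a quick sketch (the archive and directory names here are made up): an ordinary user extracts with --no-same-owner behavior by default, while root would use --same-owner to restore the owners recorded in the archive:

```shell
# build a throwaway archive to restore from
mkdir -p src && touch src/data.txt
tar -czpf backup.tar.gz src/

# ordinary user: extracted files end up owned by you
mkdir -p dest
tar -xpf backup.tar.gz --no-same-owner -C dest/

# root would keep the owners recorded in the archive (the superuser default):
# sudo tar -xpf backup.tar.gz --same-owner -C dest/
```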
You can timestamp archive names by embedding a date command (the % signs need to be escaped as \% if the command runs from cron):

tar -czpf foo.`/bin/date +\%Y\%m\%d`.tar.gz sourceFiles
bzip2 gives the best compression ratio, but is very CPU and RAM intensive. gzip has a decent compression ratio and decent resource usage.

You might want to list the contents of a tarball for different reasons. Let's say you want to find out what date the files inside a tarball were backed up/created:
tar -tf foo.tar.gz       # list the files in the tar archive
tar -tvf foo.tar         # list all files in foo.tar verbosely (permissions, ownerships, file size, time)
tar --list -f foo.tar.gz # -t and --list are the same thing (equivalent of `tar -tf foo.tar.gz`)
# tar -tf foo.tar.gz
foo/
foo/file2.txt
foo/file3.txt
foo/file9.txt
foo/file4.txt
foo/file1.txt
foo/file5.txt
foo/file8.txt
foo/file7.txt
foo/file6.txt
# tar -tvf foo.tar.gz
drwxr-xr-x root/root 0 2017-08-17 06:48 foo/
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file2.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file3.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file9.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file4.txt
-rw-r--r-- root/root 0 2017-08-17 06:48 foo/file1.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file5.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file8.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file7.txt
-rw-r--r-- root/root 0 2017-08-17 06:45 foo/file6.txt
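The listing also gives you the exact member paths, so once you've found what you need you can extract a single file instead of the whole archive. A small sketch (the archive is rebuilt here so the example is self-contained, with names matching the listing above):

```shell
# throwaway archive matching the example listing
mkdir -p foo && touch foo/file1.txt foo/file2.txt
tar -czpf foo.tar.gz foo/

# filter the verbose listing to find the entry you want
tar -tvf foo.tar.gz | grep file1.txt

# extract only that member, leaving the rest of the archive alone
tar -xpf foo.tar.gz foo/file1.txt
```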
Here's a script that I have used on one of my sites. It creates a file backup of a website in /var/www and saves it in a backups directory on the server. It also deletes backups older than 5 days, and can optionally sync backups to S3.
#!/bin/bash

DIR='/backups'
TIMESTAMP=`date +%Y%b%d`
YEAR=`date +%Y`

# Create & Compress
echo "Backing up: foo.com"
tar -czpf ${DIR}/${TIMESTAMP}.foo.com.tar.gz /var/www/foo.com/public_html/

echo "Success: backup created"

# Delete old backups (older than 5 days)
echo "Deleting old backups.."
find ${DIR}/${YEAR}*.*.tar.gz -type f -mtime +5 -delete
# -delete might not work on all systems
#find ${DIR}/${YEAR}*.*.tar.gz -type f -mtime +5 -exec rm -f {} \;

# Sync to S3
# s3cmd sync /backups/ s3://s3.foo.com/
# echo "Success: backup synced with S3"
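Since the script is meant to run unattended, a crontab entry can schedule it; the script path below is hypothetical. Remember that % is special inside a crontab, so if you ever inline a date command there, escape it as \% as shown earlier:

```shell
# m h dom mon dow  command -- run the backup nightly at 02:00
0 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
```

Logging stdout and stderr to a file makes failed runs easy to diagnose after the fact.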