Is there a way to get rsync to print the full filepaths to all files that are different without actually transferring any files?
Alternatively, I need a way to diff the files across two trees (over SSH) based only on change in size or last-modified time.
7 Answers
Rsync has a dry-run option:
-n, --dry-run    show what would have been transferred
I am not sure if this is what you want.
If you want to diff the files across two trees, you could recursively list both directories with find, pipe the output to ls, and save each listing to a file. You could then use diff to compare the two listings.
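A minimal sketch of the find-and-diff idea, assuming GNU find (for -printf) on both machines and file names without tabs or newlines. It records size and modification time directly, so no ls step is needed. The names list_tree and normalize are hypothetical helpers, and user@remotehost and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Reduce "path<TAB>size<TAB>mtime" lines to whole-second mtimes and sort them,
# so that listings from two machines line up for diff.
normalize() {
    awk -F'\t' -v OFS='\t' '{ sub(/\..*/, "", $3); print }' | LC_ALL=C sort
}

# Print one line per regular file under directory $1:
#   relative/path<TAB>size-in-bytes<TAB>mtime-in-seconds
# Requires GNU find for -printf.
list_tree() {
    (cd "$1" && find . -type f -printf '%P\t%s\t%T@\n') | normalize
}

# Usage sketch (host and paths are placeholders):
#   list_tree /path/to/tree > local.txt
#   ssh user@remotehost "cd /path/to/tree && find . -type f -printf '%P\t%s\t%T@\n'" | normalize > remote.txt
#   diff local.txt remote.txt
```

Only files whose size or modification time differ (or that exist on one side only) show up in the diff output; file contents are never read.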
I prefer to use --out-format to see the details, piping the output to less:
rsync -azh --dry-run --delete-after --out-format="[%t]:%o:%f:Last Modified %M" source destination | less
rsync -rvn localdir targetdir
The -n means to show actions only (without any action being performed).
Note that you need the 'v' or it won't show you anything! (the rest of the answers forget this...)
Building on the other answers:
- use --dry-run (or -n) to avoid modification
- use --itemize-changes (or -i) to find changes
- use --archive (or -a) to get all subdirectories
- use egrep to filter out entries starting with a dot (no change)
Which gives you:
rsync -nia source destination | egrep -v "sending incremental file list" | egrep -v "^\."
If you just want one direction, you can change the command:
- for changes from source to destination:
rsync -nia source destination | egrep -v "sending incremental file list" | egrep -v "^(\.|<)"
- for changes from destination to source:
rsync -nia source destination | egrep -v "sending incremental file list" | egrep -v "^(\.|>)"
And if you need only the file names, just add some awk magic:
rsync -nia source destination | egrep -v "sending incremental file list" | egrep -v "^\." | awk '{print $2}'
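One caveat with awk '{print $2}' is that it cuts file names at the first space. A possible variant, sketched here as a hypothetical changed_paths filter, strips only the itemize code and keeps the rest of the line. It assumes the code itself contains no spaces apart from the padding after *deleting, so a name that begins with a space would lose it.

```shell
#!/bin/sh
# Read `rsync -ni ...` output on stdin and print only the paths of the
# entries that would change.
changed_paths() {
    grep -v '^sending incremental file list' |  # drop the header line
        grep -v '^$' |                          # drop blank lines
        grep -v '^\.' |                         # drop entries that are not updated
        sed 's/^[^ ]* *//'                      # strip the itemize code, keep the path
}

# Usage sketch (source and destination are placeholders):
#   rsync -nia source destination | changed_paths
```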
The truth of the matter is that if you run rsync -v ... and it outputs a filename to the screen, that file is being transferred (or would have been transferred, if you are doing a --dry-run). To find out why rsync was about to transfer it, use itemize mode (-i, --itemize-changes).
As others have noted, by default rsync compares files only by size and timestamp; if either one differs, a "delta copy" is started on that file. If you really want to see which files differ in content, use "-c" (checksum) mode.
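To make the difference between the default quick check and checksum mode concrete, here is a small sketch. rsync_diff is a hypothetical wrapper name; the flags are standard rsync options. In the itemized code at the start of each output line, an s marks a size difference, a t a modification-time difference, and a c (with --checksum) a content difference.

```shell
#!/bin/sh
# Dry-run, itemized comparison of two trees. Nothing is transferred.
# Any extra arguments after the two paths are passed on to rsync.
rsync_diff() {
    src=$1
    dst=$2
    shift 2
    rsync --archive --dry-run --itemize-changes "$@" "$src" "$dst"
}

# Usage sketch (paths and host are placeholders):
#   rsync_diff localdir/ user@remotehost:targetdir/             # compare size + mtime
#   rsync_diff localdir/ user@remotehost:targetdir/ --checksum  # compare file contents
```

Checksum mode has to read every file on both sides, so it is noticeably slower than the default check on large trees.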
I would go for something like this:
#! /bin/bash
set -eu ## Stop on errors and on undefined variables
## The local directory name
LOCAL_DIR=$1
## The remote directory in rsync syntax. Example: "machine:directory"
REMOTE_DIR=$2
shift
shift
# Now the first two args are gone and any other remaining arguments, if any,
# can be expanded with $* or $@
# Create a temporary directory in THIS directory (hopefully on the same disk as $1:
# we need to hard link, which can only be done within the same partition)
tmpd="$(mktemp -d "$PWD/XXXXXXX.tmp" )"
# Upon exit, remove temporary directory, both on error and on success
trap 'rm -rf "$tmpd"' EXIT
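# Make a *hard-linked* copy of the contents of our repository directly inside
# the temporary directory (not in a subdirectory of it, so that it lines up
# with what rsync writes below). It uses very little space and is very quick
cp -al "$LOCAL_DIR"/. "$tmpd"/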
# Copy the files. The final «"$@"» allows us to pass on arguments for rsync
# from the command line (after the two directories).
rsync -a "$REMOTE_DIR"/ "$tmpd/" --size-only "$@"
# Compare both trees
meld "$LOCAL_DIR" "$tmpd"
For example:
$ cd svn
$ rsyncmeld myproject othermachine:myproject -v --exclude '*.svn' --exclude build
In my case I wanted to see the difference between two backups that I already had, where the newest was in a directory called backup.0 and the previous one in backup.1. rsync would output every single file in backup.0 when using
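rsync -a --delete --itemize-changes --dry-run ./backup.0 ./backup.1
but what I actually had to do was add / to the end of the directories, so that rsync compares the contents of backup.0 with the contents of backup.1 instead of looking for backup.1/backup.0: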
rsync -a --delete --itemize-changes --dry-run ./backup.0/ ./backup.1/