List contents of large tar archive quickly

I need to list the contents of a fairly large (gzipped) tar archive (about 37GiB). tar -ztvf archive takes forever. Is there some way to get this listing quickly? Possibly by only listing contents down to a certain directory depth?

Could I have done something while packing the archive to enable quicker listing of the contents?


2 Answers

So, you have a set of files that you archived with "tar", which produces a single output file. You then used "gzip" to compress that one ".tar" file into a ".tar.gz".

If that's the case, then listing the files in the ".tar" file requires decompressing the whole ".tar.gz" stream first: tar stores each file's header immediately before that file's data, so there is no central index to jump to, and a gzip stream can only be read from the beginning.

This will always take more time than listing the files stored "directly" in a ".zip" file, which keeps an index of its entries (the central directory) at the end of the archive. For a large archive, the difference can be considerable.

If you want to reduce the time required to list the files in a compressed archive file, then:

  1. decompress the ".tar.gz" file and expand the resulting ".tar" file into the underlying fileset

  2. create a ".zip" file directly from the set of files.

This way, the files in the archive file can be listed without uncompressing the archive.
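The two steps above can be sketched roughly as follows. This is only an illustration on a tiny throwaway tree: the paths under `$work` are made up for the demo, and it assumes the Info-ZIP `zip` and `unzip` tools are installed (the sketch skips the conversion if they are not).

```shell
#!/bin/sh
set -eu

# Hypothetical demo data standing in for the real 37GiB archive.
work=$(mktemp -d)
mkdir -p "$work/src/docs"
echo "hello" > "$work/src/docs/readme.txt"
tar -czf "$work/archive.tar.gz" -C "$work" src

# Step 1: decompress the .tar.gz and expand it into the underlying fileset.
mkdir "$work/unpacked"
tar -xzf "$work/archive.tar.gz" -C "$work/unpacked"

# Step 2: create a .zip directly from the files, then list it.
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    (cd "$work/unpacked" && zip -qr "$work/archive.zip" src)
    # unzip -l reads the central directory; no file data is decompressed.
    unzip -l "$work/archive.zip"
else
    echo "zip/unzip not installed; skipping the conversion" >&2
fi
```

Keep in mind that the conversion itself costs one full decompression and needs enough disk space for the unpacked files, so it only pays off if you expect to list the archive more than once.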

0

-z tells tar to gzip or gunzip the archive; tar itself only bundles files, without compression. Once the archive is gzipped there isn't much you can do: the metadata lives inside the tar file, which is compressed as a whole:

$ gunzip -l co.tar.gz
compressed  uncompressed  ratio  uncompressed_name
    177183       1044480  83.0%  file.tar

That's all gzip sees; you have to gunzip to get back to the tar. The Unix way around this is simply to generate a manifest (find . > MANIFEST) and ship it with the archive. Alternatively, rather than using .tar.gz, ship a different format that does not compress its metadata. I think 7zip fits that description.
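As a rough sketch of the manifest approach, including the depth-limited listing asked about in the question: the directory layout and the name `archive.MANIFEST` are made up for the example, and it assumes a POSIX shell with `find`, `tar` and `awk`.

```shell
#!/bin/sh
set -eu

# Hypothetical demo tree; substitute your real source directory.
work=$(mktemp -d)
mkdir -p "$work/src/docs/deep"
echo "hello" > "$work/src/docs/readme.txt"
echo "world" > "$work/src/docs/deep/notes.txt"

# Write the manifest first, using the same relative paths tar will store.
(cd "$work" && find src) > "$work/archive.MANIFEST"

# Pack the archive; the manifest ships next to it as a small plain-text file.
tar -czf "$work/archive.tar.gz" -C "$work" src

# Full listing: reads only the manifest, never the archive.
cat "$work/archive.MANIFEST"

# Listing down to directory depth 2: keep paths with at most two components.
awk -F/ 'NF <= 2' "$work/archive.MANIFEST"
```

The manifest is only as trustworthy as the moment it was written, so generate it in the same script that creates the archive.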

More recently, people have taken to building the manifest with sha1sum or md5sum, which is a better plan than a plain MANIFEST.

find destination/ -type f -exec sha1sum {} + > sha1sum

This can also be checked rather easily, although doing so is largely unnecessary because gzip already stores a CRC-32 of the uncompressed data and tar checksums each header:

sha1sum -c ./sha1sum
