I have been using Wget, and I have run across an issue. I have a site that has several folders and subfolders within it. I need to download all of the contents within each folder and subfolder. I have tried several methods using Wget, and when I check the completion, all I can see in the folders is an "index" file. I can click on the index file, and it will take me to the files, but I need the actual files.
Does anyone have a command for Wget that I have overlooked, or is there another program I could use to get all of this information?
Site example:
Within the Pictures dir, there are several folders...
America/California/JoeUser.jpg
I need all the files, folders, etc.
I have to assume you haven't tried this:
wget -r --no-parent
or, to retrieve the content without downloading the "index.html" files:
wget -r --no-parent --reject "index.html*"
Reference: Using wget to recursively fetch a directory with arbitrary files in it
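For example, run against the Pictures directory from the question (the URL here is a hypothetical placeholder; the original answer didn't include one):

wget -r --no-parent --reject "index.html*" http://example.com/Pictures/

wget will then recurse into America/California/ and the other subfolders, saving the actual files while skipping the auto-generated directory listings.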
I use wget -rkpN -e robots=off
-r means recursive, so wget follows links down into subdirectories.
-k means convert links, so links in the downloaded pages point to your local copies instead of back to the remote site.
-p means get all page requisites, so images and JavaScript files are downloaded too and the pages work properly offline.
-N turns on timestamping, so files are skipped when the local copies are newer than the ones on the remote website.
-e executes a .wgetrc-style command; it needs to be there for robots=off to work.
robots=off means ignore the robots file (robots.txt).
I also had -c in this command, so if the connection dropped it would continue where it left off when I re-ran the command. I figured -N would go well with -c; see the combined command below.
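Putting all of those flags together in one command (the URL is a hypothetical placeholder, since the answer didn't give one):

wget -rkpNc -e robots=off http://example.com/Pictures/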
wget -m -A "*" -pk -e robots=off
This will download all types of files locally, point to them from the HTML files, and ignore the robots file. (Note that the "*" needs to be quoted so the shell doesn't expand it.)
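If you only want certain file types, -A takes a comma-separated list of suffixes or patterns instead of "*". A small sketch, assuming a hypothetical URL and that you only want the JPEGs from the question's example:

wget -m -A "*.jpg" -pk -e robots=off http://example.com/Pictures/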