23 July, 2007

HowTo make a copy of a web site

Sometimes you need to copy content from a remote web site.
GNU Wget is well suited to this task.
Wget is a standard utility for downloading files from the Web; it is available on most Linux-like systems and in Cygwin.
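
Since wget is not guaranteed to be present on every system, a script can check for it first. This is only a minimal sketch; the helper name have_tool is hypothetical and is not part of wget.

```shell
#!/bin/bash

# Hypothetical helper: succeed if the named program is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool wget; then
    echo "wget found: $(command -v wget)"
else
    echo "wget is not installed; install it with your package manager." >&2
fi
```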

It is a good idea to wrap wget in a shell script like this:


xqx_teleport.sh

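#!/bin/bash

# Print usage and stop when no target URL is given.
if test -z "$1"; then
    echo "Create a local copy of an HTTP site."
    echo "Usage: $0 <target URL>"
    exit 1
fi

# Timestamped log file in the home directory (%N needs GNU date).
DATE=$(date +%Y.%m.%d_%H.%M.%S.%N)
LOG=~/$(basename "$0")-$DATE.log
touch "$LOG"

# If the log file is not writable, discard the log copy;
# tee still prints the output to the terminal.
if test ! -w "$LOG"; then
    LOG=/dev/null
fi

# wget writes its messages to stderr, so merge stderr into the pipe for tee.
wget --continue --recursive --no-parent --relative --convert-links "$1" 2>&1 | tee "$LOG"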


For example, to copy the documents from "http://www.w3.org/TR/xhtml11", run this command:
xqx_teleport.sh http://www.w3.org/TR/xhtml11
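
The wget options used in the script can be extended for a gentler or more complete copy. The sketch below is one possible variant, not a recommendation from the wget authors: the function build_mirror_args, the MIRROR_ARGS array, and the target directory are hypothetical names chosen for illustration. It only builds and prints the command line, so it can be inspected before anything is downloaded.

```shell
#!/bin/bash

# Hypothetical helper: fill MIRROR_ARGS with wget options for a site copy.
# $1 is the target URL, $2 is the local directory (defaults to the current one).
build_mirror_args() {
    local url=$1 dest=${2:-.}
    MIRROR_ARGS=(
        --continue --recursive --no-parent --relative --convert-links
        --page-requisites             # also fetch images and stylesheets the pages use
        --wait=1                      # pause one second between requests
        --directory-prefix="$dest"    # save everything under this directory
        "$url"
    )
}

build_mirror_args "http://www.w3.org/TR/xhtml11" "$HOME/mirror"

# Show the command that would be run; nothing is downloaded here.
echo "wget ${MIRROR_ARGS[*]}"
```

To actually start the download, run wget "${MIRROR_ARGS[@]}" after the call to build_mirror_args. Keeping the options in an array avoids quoting problems when the URL or directory contains spaces.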
