Wednesday, July 8, 2009

Strip HTML comments with sed without changing file attributes

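The following script* strips HTML comments from a set of files while preserving each file's modification time. It saves the timestamp before editing and restores it afterwards with touch. The key piece is a small Perl helper that prints the timestamp in the format touch -t expects ([[CC]YY]MMDDhhmm[.SS]). The comments themselves are removed by the sed script tag.sed**.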

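#!/bin/bash
# Strip HTML comments from every *.htm* file in the current directory,
# restoring each file's modification time afterwards.
for f in *.htm*
do
    [ -f "$f" ] || continue   # skip if the glob matched nothing
    INF="$f"
    OUTF="$f.out.tmp"
    # Save the mtime in the format expected by touch -t
    ts=$(/usr/bin/perl preserve.pl "$INF")
    sed -f tag.sed "$INF" > "$OUTF"
    # cp (rather than mv) overwrites in place, keeping owner and permissions
    /usr/bin/cp "$OUTF" "$INF"
    /usr/bin/rm -f "$OUTF"
    # -c: never create the file; -t: set the saved timestamp
    /usr/bin/touch -c -t "$ts" "$INF"
done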
preserve.pl ***

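#!/usr/bin/perl -w
# Print a file's modification time in the format expected by touch -t.
use strict;
use Time::Format qw(%strftime);

my $filename = $ARGV[0];
my @attrs = stat($filename) or die "Cannot stat $filename: $!\n";

# Element 9 of stat() is the mtime (element 8 is the access time).
print $strftime{'%Y%m%d%H%M.%S', $attrs[9]};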
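Where Time::Format is not installed, the same result can probably be achieved without Perl by letting touch copy the timestamps from a reference file with -r. The following is only a sketch of that alternative, not the method used in this post. The function name edit_keep_mtime is hypothetical, and it assumes that mktemp is available on the system.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original post): run a filter
# command over a file and restore the file's timestamps afterwards.
# Usage: edit_keep_mtime FILE COMMAND [ARGS...]
edit_keep_mtime() {
    local file="$1"; shift
    local ref out status=0
    # Assumption: mktemp exists (it does on GNU and BSD systems).
    ref=$(mktemp) || return 1
    out=$(mktemp) || { rm -f "$ref"; return 1; }
    # Copy the timestamps of the file onto the empty reference file.
    touch -r "$file" "$ref"
    # Run the filter with the file on stdin, collecting its output.
    if "$@" < "$file" > "$out"; then
        # Writing into the existing file keeps its owner and permissions.
        cat "$out" > "$file"
        # Copy the saved timestamps back onto the edited file.
        touch -r "$ref" "$file"
    else
        status=1
    fi
    rm -f "$ref" "$out"
    return "$status"
}
```

Inside the loop above it could be called as edit_keep_mtime "$f" sed -f tag.sed. Note that touch -r restores the access time as well as the modification time, and the file is left untouched if the filter command fails.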

* This script is mostly adapted from http://www.cyberciti.biz/faq/sed-howto-remove-lines-paragraphs/

** tag.sed is a slightly modified version of the script at http://sed.sourceforge.net/grabbag/scripts/strip_html_comments.sed

*** Uses the Time::Format module (Format.pm), available at http://search.cpan.org/~roode/Time-Format-1.11/lib/Time/Format.pm
