Trace Dupes

  • Language: Perl
  • Written: 2011
  • Requirements: find, md5sum, and duplicate files to trace

I had a wad of duplicate files on the system and wanted to bulk delete them, but not manually. This is a very quick & dirty script that does the trick; no use strict or whatever. Not my usual code style, but I had precious little time to get it going!

A Perl solution with md5sum and bash! And a dash of extra bash for the deletion afterwards.

The command:

perl dupes.pl > dupes.txt

The code:

#!/usr/bin/perl

my @files;
my %data;
my $list;

# Verbosity because this can take a while
# (printed to STDERR so it stays on screen and out of dupes.txt)
print STDERR "Gather files list using find\n";

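open (OUTPUT, "find . -type f |") or die "Cannot run find: $!";
while (<OUTPUT>){
  # Keep the trailing newline; it separates the file names in the output
  push(@files, $_);
}
close OUTPUT;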

# Verbosity because this can take a while
# (printed to STDERR so it stays on screen and out of dupes.txt)
print STDERR "Create md5 hash for all files\n";

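# Loop over files and mark the dupes using a hash
foreach $fi (@files) {
        $hash = &getmd5($fi);

        # Skip files md5sum could not read; an empty hash would also
        # turn /$hash/ into an empty pattern
        next unless $hash;

        if ( $data{$hash} ) {
                $list .= $hash . "," unless ($list =~ /$hash/);
        }

        $data{$hash} .= $fi;
}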

# Print the dupe hashes
print $list . "\n\n";

# Print the files; hash and corresponding files
foreach $dub (split(",", $list)) {
        print $dub . " : \n" . $data{$dub} . "\n";
}



# Create md5 hash
sub getmd5 {
        my $file = shift;

        # find's output still carries a trailing newline; strip it for md5sum
        chomp $file;

        # Verbosity because this can take a while & for debug
        # (STDERR, so it does not end up in dupes.txt)
        print STDERR "md5sum $file\n";

        # List-form open bypasses the shell, so difficult characters
        # (spaces, quotes, parentheses, &) need no escaping
        open (MDF, '-|', 'md5sum', $file) or return;
        my $data = "";
        while (<MDF>){
         $data .= $_;
        }
        close MDF;

        # md5sum prints "<hash>  <file name>"; only the hash is needed
        my ($sum) = split "  ", $data;

        return $sum;
}
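
As a cross-check on what the script reports, the same grouping can be sketched with find, md5sum, sort and uniq alone. This is not the script above but a different route to a similar listing, and it assumes GNU coreutils (the -w and --all-repeated options of uniq are GNU extensions). The function name list_dupes is made up for this example, and the output puts hash and file name on one line rather than in the layout of dupes.txt.

```shell
#!/bin/bash
# Sketch: list files under a directory that share an md5 hash.
# Assumes GNU md5sum and GNU uniq; list_dupes is a hypothetical helper name.
list_dupes() {
    local dir="${1:-.}"
    # md5sum prints "<32 hex chars>  <file name>". Sorting puts equal
    # hashes next to each other; uniq then compares only the first 32
    # characters (the hash) and prints every line of a repeated group,
    # with a blank line between groups.
    find "$dir" -type f -exec md5sum {} + \
        | sort \
        | uniq -w32 --all-repeated=separate
}
```

Calling list_dupes . from the directory where dupes.pl was run should show the same groups of files, which makes it easy to spot a file the script missed.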

The grep:

grep "Some Map Criteria" dupes.txt  > rmfile

The Delete:
- WARNING - This might just delete your entire system, depending on what's in that file. I'd advise you: do NOT execute this command!

while IFS= read -r file; do [ -f "$file" ] && rm -f "$file"; done < ./rmfile
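
Given the warning above, a dry run that only prints what the loop would remove may be a sensible first step. A minimal sketch, assuming rmfile holds one path per line as produced by the grep; the function name preview_delete is made up for this example.

```shell
#!/bin/bash
# Sketch: show what the delete loop would remove, without removing anything.
# preview_delete is a hypothetical helper name; it defaults to ./rmfile.
preview_delete() {
    local list="${1:-./rmfile}"
    local file
    # Same read loop as the delete command, with rm replaced by printf
    while IFS= read -r file; do
        if [ -f "$file" ]; then
            printf 'would delete: %s\n' "$file"
        else
            printf 'skipped (not a regular file): %s\n' "$file"
        fi
    done < "$list"
}
```

Running preview_delete ./rmfile and reading through the "would delete" lines gives a last chance to catch a grep pattern that matched more than intended.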

Use this code at your own peril; I am not responsible for anything that happens to you or your devices. You're a big boy or girl, please do some research before using it if you have any concerns!