Explore NY Stream

Perl Script to Delete files older than x Days

A Perl script to delete files older than X days walks a directory tree, compares each file's modification time against a configurable age threshold, and removes anything past it: safely, recursively, and without loading the whole tree into memory. Below is a corrected script, a part-by-part explanation of how it works, the common mistakes that destroy the wrong data, and how to verify it before you trust it on a real server.

This pattern is a staple of disk-space hygiene: pruning a Perforce proxy cache, clearing %TEMP%, rotating build artifacts, expiring backups, or capping a log directory. Perl is a great fit because it ships on virtually every Linux box and runs identically on Windows with Strawberry Perl, so one script covers your whole fleet.

The problem: directories that fill up faster than you can clean them

Caches and temp folders grow without bound. A naive cleanup — loading every path into an array, or shelling out to find per file — either blows up RAM or takes hours. You also need guardrails so the script never deletes something it shouldn't and never runs the disk to zero by deleting everything the moment it triggers.

The goals for a robust file-age purger are:

  • Recurse efficiently over deep trees with millions of entries without crashing.
  • Delete only files whose modification time exceeds a MAXAGE threshold in days.
  • Optionally keep the N most recent versions of a rotated file (the script below leaves this as an extension).
  • Clean up empty directories left behind after deletion.
  • Be self-contained — run on a base Perl install with no extra CPAN modules.
  • Report exactly what was scanned and removed so the run is auditable.

The solution: a self-contained Perl file-age purger

The script below is the corrected version of a real Perforce-proxy cleanup tool. It uses Perl's built-in File::Find for memory-safe recursion, computes each file's age from (stat $file)[9] (the mtime), and only unlinks files older than the threshold. It also enforces a free-space window: it starts deleting only when free space drops below a low-water mark and stops once it climbs back above a high-water mark, so a busy cache is never stripped bare.

Save it as purge_old_files.pl:

#!/usr/bin/perl
use strict;
use warnings;
use File::Find ();
use IO::Handle;
STDOUT->autoflush(1);

# ---- Configuration ----
my $MAXAGE = 7;        # delete files older than this many days
my $DRY_RUN = 1;       # 1 = preview only, 0 = actually delete

# Free-space thresholds in bytes (set to 0 to ignore free space)
my $MINFREE = 150 * 1024**3;  # only start deleting below this much free
my $MAXFREE = 250 * 1024**3;  # stop deleting once this much is free

# ---- Counters ----
my ($nfiles, $ndirs, $ndeleted, $nempty) = (0, 0, 0, 0);

sub human {
    my $b = shift;
    return sprintf('%.2f GB', $b / 1024**3) if $b > 1024**3;
    return sprintf('%.2f MB', $b / 1024**2) if $b > 1024**2;
    return sprintf('%.2f KB', $b / 1024)    if $b > 1024;
    return "$b Bytes";
}

# Portable free-space check: uses the CPAN module Filesys::DfPortable when it
# is installed, otherwise parses the OS command. Works on base installs.
sub freespace {
    my $path = shift;
    if (eval { require Filesys::DfPortable; 1 }) {
        my $ref = Filesys::DfPortable::dfportable($path, 1);
        return $ref->{bavail} if $ref;
    }
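    if ($^O eq 'MSWin32') {
        # Needs a drive-letter path (C:\...); UNC or relative paths fall
        # through to the die below. Matches English "bytes free" output only.
        if (my ($drive) = $path =~ /^([A-Za-z]:)/) {
            for my $line (`dir $drive`) {
                next unless $line =~ /([\d,]+) bytes free/;
                (my $bytes = $1) =~ tr/,//d;   # strip thousands separators
                return $bytes;
            }
        }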
    } else {
        my @out = `df -k -P "$path"`;
        if (@out >= 2 && $out[1] =~ /^\S+\s+\d+\s+\d+\s+(\d+)/) {
            return $1 * 1024;
        }
    }
    die "FAILED to find free space on $path\n";
}

my $root = shift @ARGV
    or die "USAGE: $0 <dir>  (set \$DRY_RUN=0 in the script to delete)\n";
die "Not a directory: $root\n" unless -d $root;

my $free_before = freespace($root);
printf "%s INFO: free space %s on %s\n",
    scalar(localtime), human($free_before), $root;

if ($MINFREE && $free_before >= $MINFREE) {
    print "INFO: free space above trigger; nothing to do.\n";
    exit 0;
}

my $now = time;
my $stop = 0;

File::Find::find({
    no_chdir => 1,
    wanted   => sub {
        return if $stop;
        my $path = $File::Find::name;
        if (-d $path) { $ndirs++; return; }
        return unless -f _;            # reuse the stat from -d above
        $nfiles++;

        # Stop early once we are comfortably above the high-water mark.
        if ($MAXFREE && ($nfiles % 500 == 0) && freespace($root) > $MAXFREE) {
            $stop = 1;
            return;
        }

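        # stat the path explicitly: freespace() above may run its own file
        # tests, which would overwrite the cached "_" stat buffer.
        my $mtime = (stat $path)[9];
        return unless defined $mtime;  # vanished or unreadable; skip it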
        my $days  = ($now - $mtime) / 86400;
        return if $days <= $MAXAGE;

        if ($DRY_RUN) {
            printf "%s WOULD delete (%.1f days) %s\n",
                scalar(localtime), $days, $path;
            $ndeleted++;
        } elsif (unlink $path) {
            $ndeleted++;
            printf "%s deleted (%.1f days) [%d] %s\n",
                scalar(localtime), $days, $ndeleted, $path;
        } else {
            warn "ERROR: cannot delete $path: $!\n";
        }
    },
}, $root);

# Second pass: remove now-empty directories (deepest first).
if (!$DRY_RUN) {
    my @dirs;
    File::Find::find(sub { push @dirs, $File::Find::name if -d }, $root);
    for my $d (reverse sort @dirs) {
        next if $d eq $root;
        if (rmdir $d) { $nempty++; print "INFO: removed empty dir $d\n"; }
    }
}

printf "\nScanned %d files in %d dirs. %s %d files, removed %d empty dirs.\n",
    $nfiles, $ndirs,
    ($DRY_RUN ? 'WOULD delete' : 'deleted'), $ndeleted, $nempty;
printf "Free before %s, now %s.\n",
    human($free_before), human(freespace($root));

How the Perl delete-old-files script works

Every important behaviour maps to a small, testable piece of code. Here is what each part does and why it matters.

1. Memory-safe recursion with File::Find

The original tool hand-rolled a recursive do_it() that called opendir and pushed directories onto a stack. That works, but Perl's core File::Find already handles the traversal correctly. It walks the tree depth-first and reads one directory at a time rather than building one giant list of every path, so memory use roughly tracks the largest single directory, not the whole tree. Passing no_chdir => 1 keeps the process in its starting directory, so $File::Find::name remains a path you can stat and unlink directly.
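To see the traversal on its own, here is a minimal sketch that counts files and directories with File::Find and no_chdir. The helper name count_tree and the throwaway directory layout are made up for this example; only File::Find::find and its no_chdir/wanted options come from the script above.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Count files and directories under $root without collecting paths in a list.
# (count_tree is a hypothetical helper, not part of the purger script.)
sub count_tree {
    my ($root) = @_;
    my ($files, $dirs) = (0, 0);
    File::Find::find({
        no_chdir => 1,                      # $File::Find::name stays a usable path
        wanted   => sub {
            my $path = $File::Find::name;
            if    (-d $path) { $dirs++ }
            elsif (-f _)     { $files++ }   # reuse the stat buffer from -d
        },
    }, $root);
    return ($files, $dirs);
}

# Demo on a throwaway tree created only for this example.
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b", "$root/c");
for my $name ('a/one.txt', 'a/b/two.txt', 'c/three.txt') {
    open my $fh, '>', "$root/$name" or die "cannot create $root/$name: $!";
    close $fh;
}
my ($files, $dirs) = count_tree($root);
print "files=$files dirs=$dirs\n";   # the root itself counts as a directory
```

The callback sees one entry at a time, which is why the purger can keep only counters in memory during its first pass.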

2. Age comparison from modification time

The age of a file is (time - mtime) / 86400 days, where mtime is field index 9 of the list returned by stat ((stat $file)[9]). A file is deleted only when $days > $MAXAGE. Using mtime rather than atime is the right call: access-time updates are frequently reduced or switched off (Linux mounts default to relatime, which updates atime only occasionally, or use noatime, which never does; NTFS last-access updates are often disabled as well), so atime is unreliable for “how long since this was touched.”
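The age calculation can be checked in isolation. The sketch below backdates a temporary file by ten days with utime and then computes its age the same way the script does. The helper age_in_days and the ten-day figure are illustrative assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Age of a file in days, based on mtime (index 9 of stat's return list).
# Returns undef in scalar context if the file cannot be stat'ed.
# (age_in_days is a hypothetical helper, not part of the purger script.)
sub age_in_days {
    my ($path, $now) = @_;
    $now = time unless defined $now;
    my $mtime = (stat $path)[9];
    return unless defined $mtime;
    return ($now - $mtime) / 86400;
}

my $MAXAGE = 7;
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# Backdate the file so the example does not have to wait ten days.
my $now          = time;
my $ten_days_ago = $now - 10 * 86400;
utime $ten_days_ago, $ten_days_ago, $file or die "utime failed: $!";

my $age = age_in_days($file, $now);
printf "%.1f days old; %s\n", $age, ($age > $MAXAGE ? 'past threshold' : 'kept');
```

Passing $now in explicitly keeps every file in a run measured against the same instant, which is what the script does with its single my $now = time.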

3. The free-space window (low- and high-water marks)

Deleting everything the instant a disk gets low is dangerous — you can wipe a working cache that is about to be reused. Instead the script only starts when free space falls below $MINFREE and stops once it has freed enough to cross $MAXFREE. Re-checking free space on every single file would be slow, so the corrected version checks once at startup and then only every 500 files ($nfiles % 500 == 0).
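The two marks form a simple hysteresis: start below the low mark, stop above the high one. A minimal sketch of that decision, separated from any real disk, might look like this. The helper names should_start and should_stop and the sample readings are invented for illustration; the 150 GB and 250 GB values mirror $MINFREE and $MAXFREE in the script.

```perl
use strict;
use warnings;

my $GB = 1024**3;

# A purge run begins only when free space is below the low-water mark.
# A mark of 0 disables the gate, so the run always starts.
sub should_start {
    my ($free, $minfree) = @_;
    return 1 if !$minfree;
    return $free < $minfree ? 1 : 0;
}

# A running purge stops once free space is above the high-water mark.
# A mark of 0 means "never stop early".
sub should_stop {
    my ($free, $maxfree) = @_;
    return 0 if !$maxfree;
    return $free > $maxfree ? 1 : 0;
}

# Simulated free-space readings (made-up numbers), one per check.
for my $free_gb (120, 180, 240, 260) {
    printf "%3d GB free: start=%d stop=%d\n", $free_gb,
        should_start($free_gb * $GB, 150 * $GB),
        should_stop($free_gb * $GB, 250 * $GB);
}
```

Between the two marks neither function fires, so a run that has already started keeps going while a new run would not begin.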

4. Cleaning up empty directories

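After files are removed, parent folders may be empty. A second pass collects directories and calls rmdir on them in reverse sorted order so children are removed before parents. rmdir only succeeds on a truly empty directory, which makes this self-protecting: a folder that still holds a kept file is silently skipped. Two caveats: the pass removes every empty directory under the root, including ones that were already empty before the run, and it holds the full list of directory paths in memory (usually far smaller than the list of files).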
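An alternative that avoids building the directory list is File::Find::finddepth, which visits a directory only after its contents. The sketch below is not what the script does. It swaps in that post-order traversal, and the helper name remove_empty_dirs and the sample tree are assumptions for the example.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Remove empty directories under $root, children before parents.
# (remove_empty_dirs is a hypothetical helper, not part of the purger script.)
sub remove_empty_dirs {
    my ($root) = @_;
    my $removed = 0;
    File::Find::finddepth({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return if $path eq $root;              # never remove the root itself
            return unless -d $path && !-l $path;   # real directories only
            $removed++ if rmdir $path;             # fails harmlessly if not empty
        },
    }, $root);
    return $removed;
}

# Demo: one nested empty branch and one directory that still holds a file.
my $root = tempdir(CLEANUP => 1);
make_path("$root/empty/deeper", "$root/keep");
open my $fh, '>', "$root/keep/file.txt" or die "cannot create file: $!";
close $fh;

my $removed = remove_empty_dirs($root);
print "removed $removed empty dirs\n";
```

Because "deeper" is visited before "empty", the parent is already empty by the time rmdir reaches it, while "keep" survives because it still contains a file.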

5. A real dry-run switch

The single most valuable change to the original: a $DRY_RUN flag that defaults to preview mode. It logs every file it would delete without touching anything. You flip it to 0 only after you have read the preview and trust it.
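If editing the script to flip the flag feels error-prone, the same default-to-preview idea can be driven from the command line. This is a sketch under assumptions: the --delete and --maxage options are hypothetical and do not exist in the script above, and parse_args is an invented helper. Getopt::Long itself ships with Perl.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse purger options from an argument list. Deletion is opt-in:
# without --delete the caller stays in preview mode.
# (--delete, --maxage and parse_args are hypothetical, for illustration only.)
sub parse_args {
    my @args  = @_;
    my %opt   = (delete => 0, maxage => 7);
    my $usage = "USAGE: $0 [--delete] [--maxage DAYS] <dir>\n";
    GetOptionsFromArray(\@args,
        'delete'   => \$opt{delete},
        'maxage=i' => \$opt{maxage},
    ) or die $usage;
    die $usage unless @args == 1;      # exactly one directory expected
    $opt{root}    = $args[0];
    $opt{dry_run} = $opt{delete} ? 0 : 1;
    return \%opt;
}

# Demo with a fixed argument list instead of @ARGV.
my $opt = parse_args('--maxage', 14, '/var/cache/myapp');
printf "root=%s maxage=%d dry_run=%d\n", @{$opt}{qw(root maxage dry_run)};
```

In a real script you would call parse_args(@ARGV). The design point is that forgetting a flag leaves you in preview mode rather than in delete mode.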

Running the file-age purger

Follow these steps in order — do not skip the dry run.

  1. Confirm Perl is installed. On Linux it almost always is; on Windows install Strawberry Perl. Verify with perl -v.
  2. Save the script as purge_old_files.pl and keep $DRY_RUN = 1 for now.
  3. Preview a directory — nothing is deleted:
    perl purge_old_files.pl C:\Users\me\AppData\Local\Temp (Windows)
    perl purge_old_files.pl /var/cache/myapp (Linux)
  4. Read the output. Every line starting with WOULD delete shows the age and path. Make sure the list is what you expect.
  5. Enable deletion by editing the script: set my $DRY_RUN = 0;.
  6. Run for real with the same arguments. The script now logs each deletion and prints a summary of files removed, empty directories cleaned, and free space before/after.
  7. Automate it. Schedule the script once you trust it (see the scheduling section below).

To clean several locations in one job, call it once per path, for example in a batch file on Windows:

@echo off
perl C:\Scripts\purge_old_files.pl C:\Users\me\AppData\Local\Temp
perl C:\Scripts\purge_old_files.pl C:\temp
perl C:\Scripts\purge_old_files.pl D:\Backups
echo done

Bugs in the original script — and how the rewrite fixes them

The source this was adapted from worked, but it carried real defects. Fixing them is the difference between a tool you can trust on a production server and one that quietly deletes the wrong thing.

| Issue in the original | Why it is a problem | Fix in the rewrite |
|---|---|---|
| No use strict; use warnings; | Typos in variable names fail silently; bugs slip through. | Both pragmas enabled; all variables declared with my. |
| opendir return value never checked | An unreadable directory is treated as empty and could then be rmdir'd. | File::Find handles errors; rmdir only ever succeeds on a genuinely empty dir. |
| freespace() called on every file | Spawning dir/df per file is extremely slow on large trees. | Checked at startup and then once per 500 files. |
| No preview mode | First run deletes immediately; a wrong path is catastrophic. | $DRY_RUN = 1 by default; logs without deleting. |
| Parsing dir output only (Windows-only) | Breaks on Linux servers. | Uses Filesys::DfPortable if available, else df -P on Unix and dir on Windows. |

Common pitfalls when deleting files by age

  • Symlinks and junctions. Following a symlink during deletion can lead you out of the intended tree. File::Find does not follow symlinks unless you pass follow => 1 — keep it off unless you truly need it.
  • Wrong path = data loss. Always pass an absolute path and dry-run first. A trailing variable or a stray space has wiped production data more than once.
  • mtime vs ctime vs atime. Use mtime for “last changed.” On Unix, ctime is the inode-change time (permissions, rename), not content age; on Windows, Perl reports the creation time in that field. atime is often not updated reliably.
  • Files in use. On Windows, an open file cannot be deleted and unlink returns false — the script logs the error and moves on rather than crashing.
  • Clock skew. If a file's mtime is in the future (bad clock, restored archive), its computed age is negative and it is correctly skipped, never deleted prematurely.
  • Running as the wrong user. Deleting another user's temp files needs sufficient privilege; run scheduled cleanups as an account that can read and remove the target tree.
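For the symlink pitfall, one defensive option is to test entries with lstat, which inspects the link itself rather than its target, and skip anything that is not a regular file. The sketch below illustrates that check only. The helper real_files is hypothetical, and the script above uses -d and -f (which follow links) instead.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);

# Collect regular files under $root, skipping symlinks entirely.
# lstat inspects the link itself, so -f _ is true only for real files.
# (real_files is a hypothetical helper; it returns a list for the demo only.)
sub real_files {
    my ($root) = @_;
    my @found;
    File::Find::find({
        no_chdir => 1,
        wanted   => sub {
            lstat $File::Find::name or return;   # entry vanished or unreadable
            push @found, $File::Find::name if -f _;
        },
    }, $root);
    my @sorted = sort @found;
    return @sorted;
}

# Demo: one real file plus a symlink pointing at it.
my $root = tempdir(CLEANUP => 1);
open my $fh, '>', "$root/real.txt" or die "cannot create file: $!";
close $fh;
# symlink() is unavailable on some platforms, so failure here is tolerated.
eval { symlink "$root/real.txt", "$root/link.txt" };

my @files = real_files($root);
print "$_\n" for @files;
```

Only real.txt is listed; the link is ignored even though its target is a regular file.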

Verifying the cleanup worked

Trust, but verify. After a real run, confirm three things:

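  1. The summary line matches the preview. The count of deleted files should be close to the number of WOULD delete lines from your last dry run. It can be lower if some files were locked or the run stopped early at the $MAXFREE high-water mark, and higher if more files crossed the age threshold in between.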
  2. No fresh files were removed. Spot-check that recently modified files are still present:
    Linux: find /var/cache/myapp -mtime -7 | head
    Windows (PowerShell): Get-ChildItem -Recurse C:\temp | Where-Object LastWriteTime -gt (Get-Date).AddDays(-7)
  3. Free space increased. The script's own “Free before / now” line should show a gain; confirm with df -h (Linux) or Get-PSDrive (Windows).
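The spot-checks above can also be done from Perl, which helps when the same check has to run on both platforms. This is a minimal sketch, assuming a seven-day threshold. The helper age_census and the two sample files are invented for the example.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);

# Count regular files under $root that are older ("stale") or not older
# ("fresh") than $maxage days. Symlinks are ignored.
# (age_census is a hypothetical helper, not part of the purger script.)
sub age_census {
    my ($root, $maxage, $now) = @_;
    my %count = (stale => 0, fresh => 0);
    File::Find::find({
        no_chdir => 1,
        wanted   => sub {
            my @st = lstat $File::Find::name or return;
            return unless -f _;                      # regular files only
            my $days = ($now - $st[9]) / 86400;      # index 9 is mtime
            $count{ $days > $maxage ? 'stale' : 'fresh' }++;
        },
    }, $root);
    return \%count;
}

# Demo: one fresh file and one backdated to 30 days ago (throwaway temp dir).
my $root = tempdir(CLEANUP => 1);
my $now  = time;
for my $spec (['new.log', 0], ['old.log', 30]) {
    my ($name, $days) = @$spec;
    my $path = "$root/$name";
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
    my $t = $now - $days * 86400;
    utime $t, $t, $path or die "utime failed on $path: $!";
}
my $census = age_census($root, 7, $now);
print "stale=$census->{stale} fresh=$census->{fresh}\n";
```

After a full real run, the stale count for the purged tree should be zero or close to it. A large remainder usually means the run stopped at the high-water mark or hit locked files.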

Scheduling the Perl script

Once verified, run it automatically. On Linux, add a cron entry to run nightly at 2 a.m.:

0 2 * * * /usr/bin/perl /opt/scripts/purge_old_files.pl /var/cache/myapp >> /var/log/purge.log 2>&1

On Windows, register a Scheduled Task pointing at the batch wrapper:

schtasks /create /tn "PurgeOldFiles" /tr "C:\Scripts\purge.bat" /sc daily /st 02:00

Always redirect output to a log so each run stays auditable.

Modern equivalents and when to skip Perl

Perl remains a fine choice when you want one self-contained script across Windows and Linux. But for simple cases, native tools may be all you need:

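  • Linux one-liner: find /path -type f -mtime +7 -delete removes files last modified more than seven full days ago (find truncates the age to whole days, so +7 matches files at least eight days old). To preview, run the same command with -print in place of -delete; adding -print in front of -delete still deletes.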
  • Empty dirs: find /path -type d -empty -delete.
  • Windows built-in: forfiles /P "C:\temp" /S /D -7 /C "cmd /c if @isdir==FALSE del @path" deletes files not modified in the last seven days. The @isdir test keeps del from being pointed at a whole directory.
  • PowerShell: Get-ChildItem -Path C:\temp -Recurse -File | Where-Object LastWriteTime -lt (Get-Date).AddDays(-7) | Remove-Item -WhatIf — drop -WhatIf to delete for real.
  • Python: os.walk + os.stat().st_mtime + os.remove if your shop standardizes on Python.
  • systemd-tmpfiles handles temp-directory aging declaratively on modern Linux.

Reach for the Perl script (or Python) when you need cross-platform behaviour, free-space gating, and structured logging in a single file, or want room to add features such as version retention (which the script above does not implement). That logic is awkward to express in a one-liner.
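Version retention is listed among the goals, but the script does not implement it. As a rough sketch of how it could work, the function below takes a map of path to mtime for one rotation set and returns everything except the N newest entries. The helper purge_candidates, the file names, and the timestamps are all made up for illustration; in real use the map would be filled from (stat $file)[9].

```perl
use strict;
use warnings;

# Given a hash ref of path => mtime, return the paths that may be purged:
# everything except the $keep newest entries (ties broken by name).
# (purge_candidates is a hypothetical helper, not part of the purger script.)
sub purge_candidates {
    my ($mtime_of, $keep) = @_;
    my @newest_first =
        sort { $mtime_of->{$b} <=> $mtime_of->{$a} || $a cmp $b }
        keys %$mtime_of;
    return @newest_first > $keep
        ? @newest_first[$keep .. $#newest_first]
        : ();
}

# Made-up rotation set: app.log.1 is the newest, app.log.4 the oldest.
my %mtime_of = (
    'app.log.1' => 1_700_000_400,
    'app.log.2' => 1_700_000_300,
    'app.log.3' => 1_700_000_200,
    'app.log.4' => 1_700_000_100,
);
my @old = purge_candidates(\%mtime_of, 2);
print "purge: @old\n";   # prints "purge: app.log.3 app.log.4"
```

How files are grouped into a rotation set (by prefix, by directory, by pattern) is a separate decision this sketch leaves open.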

Key Takeaways

  • Use File::Find for memory-safe recursion and (stat $file)[9] (mtime) to compute file age in days.
  • Delete only when age > MAXAGE, and gate deletion behind low/high free-space marks so a busy cache is never stripped bare.
  • Always default to a dry run — preview every deletion, then flip the switch to delete for real.
  • Prefer mtime over atime (often disabled) or ctime (inode changes, not content age); never follow symlinks unless intended.
  • Verify after each run: deleted count matches the preview, recent files survive, and free space actually increased — then schedule it via cron or Task Scheduler with logging.

Frequently Asked Questions

How do I delete files older than 7 days in Perl?

Walk the directory with File::Find, read each file's modification time with (stat $file)[9], compute (time - mtime) / 86400 to get its age in days, and call unlink when that value exceeds 7. Run in preview mode first so you can confirm the list before anything is removed.

Which timestamp should I use to find old files — mtime, atime, or ctime?

Use mtime (modification time) for “how long since the content last changed.” atime (access time) is unreliable because many systems reduce or disable access-time updates for performance, and ctime reflects inode metadata changes like permissions or renames rather than content age.

How can I test the script safely before it deletes anything?

Keep $DRY_RUN = 1 so the script only logs the files it would delete. Review that output carefully, confirm the paths and ages are correct, and only then set $DRY_RUN = 0 to enable real deletion.

Does this Perl script run on both Windows and Linux?

Yes. Perl is cross-platform, and the script detects the OS for its free-space check — using Filesys::DfPortable when available, df on Unix, and dir on Windows. The deletion logic with File::Find and unlink is identical on both.
