How to remove duplicate files with PowerShell

Published November 21, 2022

I am a developer from Nebraska in the US.

So, you have a folder full of files with similar names, and you are not sure what is in them. Storage space is tight, and you do not want to keep duplicate copies. How can you remove the duplicates without the painstaking work of opening and comparing each file? With PowerShell and file hashes, the answer is surprisingly easy. Files that are exact copies of each other produce the same file hash, so the operation is to compute the hash of each file in a group, find the hashes that occur more than once, and remove all but one file for each of those hashes. Here is the complete operation to remove duplicate files, with an explanation that follows.

Get-ChildItem "C:\FilePath\*FileName.file" | Get-FileHash | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 } | ForEach-Object { $_.Group | Select-Object -Skip 1 } | Remove-Item -Verbose

The first step is to get a list of the files to check for duplicates. This is done with Get-ChildItem and the path and name pattern of the files. This list is then sent to the Get-FileHash command, which computes the file hash for each file in the list (SHA256 by default). The results are then grouped by hash value with Group-Object, so files with the same hash end up together in one group. This is important for the next step.
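Before deleting anything, it can help to look at the groups themselves. The following is a minimal sketch of the first three steps only; the function name Get-FileHashGroup is a hypothetical helper made up for illustration, not a built-in command.

```powershell
# Hypothetical helper (not a built-in cmdlet): groups the files that match
# $Path by their content hash and returns one group object per distinct hash.
function Get-FileHashGroup {
    param(
        [Parameter(Mandatory)]
        [string]$Path   # folder and pattern, e.g. "C:\FilePath\*FileName.file"
    )

    Get-ChildItem -Path $Path -File |    # files only, no directories
        Get-FileHash |                   # SHA256 unless -Algorithm is given
        Group-Object -Property Hash      # Name = hash, Count = files, Group = the files
}
```

Each object returned has a Name (the hash), a Count (how many files share it), and a Group (the hash records, each with a Path). Piping the result to Where-Object { $_.Count -gt 1 } shows only the duplicated hashes without removing anything.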

The grouped list is then sent to Where-Object, which keeps only the groups whose count is greater than one. These are the hashes shared by two or more files, that is, the duplicates. The remaining groups are then sent to ForEach-Object, which loops through each set of duplicate files. In each set, Select-Object -Skip 1 skips the first file, so that file is kept on disk, and the rest are passed to Remove-Item, which deletes them. The -Verbose switch prints a message for each file removed. After the command finishes, one copy of each duplicated file remains.
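Because deletion cannot be undone, a dry run is a sensible precaution. Here is a sketch that wraps the same pipeline in a function supporting -WhatIf; the function name Remove-DuplicateFile is hypothetical, and this is one possible arrangement rather than the only way to do it.

```powershell
# Hypothetical wrapper around the one-liner; the function name is illustrative.
function Remove-DuplicateFile {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)]
        [string]$Path   # folder and pattern, e.g. "C:\FilePath\*FileName.file"
    )

    Get-ChildItem -Path $Path -File |
        Get-FileHash |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 } |                      # hashes shared by 2+ files
        ForEach-Object { $_.Group | Select-Object -Skip 1 } |  # keep the first file of each group
        ForEach-Object {
            # ShouldProcess honors -WhatIf and -Confirm passed to this function.
            if ($PSCmdlet.ShouldProcess($_.Path, 'Remove duplicate file')) {
                Remove-Item -LiteralPath $_.Path
            }
        }
}
```

Calling Remove-DuplicateFile -Path "C:\FilePath\*FileName.file" -WhatIf only reports which files would be removed. Running the same call without -WhatIf deletes them. Using -LiteralPath for the removal avoids surprises when a file name contains wildcard characters such as square brackets.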

#powershell #windows

Unemployed Philosopher

I am a teacher, a writer, and a dad. I enjoy learning software and writing about my experience. I also love baseball, cooking, and reading old books.
