

Tools I use for filtering and deduplicating images. These are licensed under GPL-3.0-only. A copy of the license can be found in the LICENSE.txt file in this repository.


This tool fingerprints images for deduplication. It is useful for finding duplicates of the same image across image formats and resolutions.
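To illustrate how a fingerprint can stay the same across formats and resolutions, here is a minimal sketch of an average hash in Python. This README does not say which algorithm walk-image-fingerprint uses, so this is only one well-known technique and not a description of the tool. The function name average_hash and its parameters are hypothetical, and the image is assumed to be already decoded into rows of grayscale values.

```python
def average_hash(pixels, size=8):
    """Return a (size * size)-bit fingerprint of a grayscale image.

    `pixels` is a list of rows of brightness values (0-255).
    Assumption: this is an illustrative average hash, not the
    algorithm walk-image-fingerprint actually implements.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for cy in range(size):
        # Rows of the source image covered by this grid cell.
        y0 = cy * height // size
        y1 = max((cy + 1) * height // size, y0 + 1)
        for cx in range(size):
            # Columns of the source image covered by this grid cell.
            x0 = cx * width // size
            x1 = max((cx + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if the cell is brighter than the mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


# A 16x16 image and the same image upscaled to 32x32 by pixel replication.
small = [[0] * 8 + [255] * 8 for _ in range(16)]
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]
same_fingerprint = average_hash(small) == average_hash(large)
```

Because the image is reduced to a coarse grid before hashing, the result depends on the overall brightness layout and not on the pixel dimensions or the file encoding.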

This program is written in Rust and requires a Rust installation to compile. Once Rust is installed, compile the program by running the cargo build --release command inside the walk-image-fingerprint directory. The resulting executable is located in the target/release subdirectory.

The output CSV has two columns: the first contains the file path and the second contains the fingerprint. The program takes two command-line arguments: the first is the folder to scan for images and the second is the name of the output file.

./walk-image-fingerprint /home/brandon/Pictures metadata.csv

This example invocation will scan the /home/brandon/Pictures folder and write the results to a file named metadata.csv in the current directory.

After fingerprinting is done, duplicates can be detected by running the Python script with the previous output file as its first argument.
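As a rough illustration of this step, the following sketch groups the rows of the output CSV by fingerprint and reports every group with more than one file. It is not the script shipped in this repository. The function name find_duplicates and the has_header parameter are hypothetical, and the sketch assumes that duplicates have exactly equal fingerprints and that the file may begin with a header row.

```python
import csv
from collections import defaultdict


def find_duplicates(csv_path, has_header=True):
    """Return lists of file paths that share the same fingerprint.

    Assumptions: column 1 is the file path, column 2 is the
    fingerprint, and duplicates have identical fingerprints.
    """
    groups = defaultdict(list)
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row if present
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed rows
            path, fingerprint = row[0], row[1]
            groups[fingerprint].append(path)
    # Keep only fingerprints that occur for more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling find_duplicates("metadata.csv") on the file from the example above would return one list of paths per set of duplicates. If the file has no header row, pass has_header=False so that the first image is not skipped.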