RFE: tell fdupes to always prefer a certain directory #132

Open
macau23 opened this issue Mar 3, 2020 · 19 comments

Comments

@macau23

macau23 commented Mar 3, 2020

I'd like to be able to tell fdupes to always prefer to preserve files within a given directory if duplicates are found. At the moment the order in which duplicates are presented is not deterministic, which means that deleting duplicate files takes a lot longer:

$ fdupes -r --delete .
[1] ./aaa/1.txt
[2] ./bbb/1.txt
Set 1 of 2, preserve files [1 - 2, all]: x

[1] ./bbb/2.txt
[2] ./aaa/2.txt
Set 2 of 2, preserve files [1 - 2, all]: x

After the suggestion:

$ fdupes -r --prefer ./aaa/ --delete .
[1] ./aaa/1.txt
[2] ./bbb/1.txt
Set 1 of 2, preserve files [1 - 2, all]: x

[1] ./aaa/2.txt
[2] ./bbb/2.txt
Set 2 of 2, preserve files [1 - 2, all]: x

This enables me to use --noprompt without losing files from the wrong directory.

@sandrotosi

I'd like to see this feature too!

@da-sti

da-sti commented Jan 6, 2021

A much-needed feature for automatically deleting all the duplicate pictures that were imported again and again from and to different devices.

@bjhartin

bjhartin commented Feb 18, 2022

Can PR #144 be reviewed and merged?

This is a very needed feature for situations where directory A has some of the same files as B, but not in the same structure.

Happened to me when trying to merge a relative's photo library with mine. They had a lot of my pics but had rearranged them.

I want to run fdupes -dNr -keep=A A B

@Friday13th87

jdupes, a fdupes fork, is doing that with "-O"
As this feature has been requested for years, I assume that it will never come to fdupes.

@jbruchon

jdupes, a fdupes fork, is doing that with "-O"

I'm the author of jdupes. No, it does not. That's the parameter order priority flag and only controls sorting.

@Friday13th87

Friday13th87 commented Nov 16, 2022

The order priority flag controls the sorting, meaning you can say that duplicates should be deleted in dir2 rather than in dir1, for example. OK, so far so good.

the question of the topic was:

"tell fdupes to always prefer a certain directory"

The ability to set priorities does exactly that: "prefer a directory".
Meaning: jdupes -rNdO dir1/ dir2/ sets the preserve priority to dir1, so dir1 comes first and duplicates will be deleted in dir2 rather than in dir1, and that was the question.

My answer was correct in every way.

@jbruchon

No, it's not. Your ego is not in question here; your correctness is. The parameter order controls the sorting, not the "preserve priority." Deletions will gladly nuke items in the first directory specified. It'll delete files in dir1 all day long. The request was to "always prefer to preserve files within a given directory." -O will (probably) preserve the first file in dir1 but all the rest of the files in dir1 in the set will be deleted. Your answer is only correct for the simplistic example in the original post. Most data sets are not nearly so simple. "The parameter order flag will 'always prefer to preserve files within a given directory'" is a false statement.

I will not entertain further discussion on this. You can't tell me I don't know how the program I wrote works.
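
To make the point about sorting versus preserving concrete, here is a minimal sketch that only simulates a "keep the first file of a sorted set, delete the rest" policy, which is what -N does with a duplicate set. It does not call jdupes; the function name would_delete and the paths are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of jdupes: given ONE duplicate set that is
# already sorted (one path per line on stdin), print the paths that a
# "keep the first, delete the rest" policy would remove.
would_delete() {
    # Line 1 is the preserved file; every remaining line would be deleted.
    sed '1d'
}

# A single duplicate set in which dir1 itself holds two copies of the file.
printf '%s\n' 'dir1/a.jpg' 'dir1/copy/a.jpg' 'dir2/a.jpg' | would_delete
```

The second dir1 path shows up in the output: sorting dir1 first protects only the first file of the set, not everything inside dir1.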

@Friday13th87

No, sorry, you are not right, and this is not about my ego; sadly, it's about yours. I just wanted to help, and you are doing exactly the opposite to prove I-don't-know-what.

The initial poster was searching for a way to prefer one directory over another, which means "if possible, delete from directory x and not from y; if that's not possible, do what you have to do".
Preferring one directory over another doesn't mean that the initial poster was looking for a way to prohibit deletions from one directory, just that when there is a duplicate in both directories, the copy in one specific directory is the one that is kept.
jdupes dir1/ dir2/ is doing this.

And for @bjhartin: with jdupes you can easily do what you want with:

chmod -R 555 dir1/    # jdupes can't delete files here, but can still read them to calculate hashes etc.
jdupes -rNdO dir1/ dir2/
chmod -R 755 dir1/    # or whatever permissions you want the directory to have

I hope that helps.

@JohnCrafton

JohnCrafton commented Nov 17, 2022

No, it's not. Your ego is not in question here; your correctness is.

@Friday13th87 You're being really unhelpful. The fellow said he's the author of jdupes; shut it down. Whether you think you're right no longer matters.

You're giving advice you claim as authoritative when the author of the program refuted you.

To anyone visiting this thread (and likely any others with this Friday person): caveat emptor.

I came looking for a way to do this thing, too, incidentally. I'd love a way to --prefer /some/arbitrary/master/path in one of these tools.

I suppose it's back to setting the "master" as read-only and running fdupes to see if it blows up.

@jbruchon

jbruchon commented Nov 17, 2022

The fellow said he's the author of jdupes; shut it down. Whether you think you're right no longer matters.

To be fair: I didn't write every piece of code in jdupes and it's entirely possible to trip over my own human errors. The code behind -O, however, I personally wrote and tested. I know exactly what it does and there's a good chance I don't have dementia (yet). Fortunately, I can be completely mentally broken and anyone can still see exactly how it works.

@jbruchon

@JohnCrafton you might find the example scripts in the jdupes code base to be useful. I recognized that many people want to perform custom actions that the core program doesn't handle, so I wrote some template/example shell scripts that can be modified to suit your needs. They should also be able to use fdupes instead of jdupes as long as you check the options passed to the program. The output format is the same (duplicate items one per line with an empty line between duplicate sets). You can use grep to match a substring and decide to not act upon a specific directory or file, for example.
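
As a rough illustration of that kind of post-processing (this is not one of the jdupes example scripts), the sketch below reads the output format described above, one path per line with an empty line between sets, and prints only the paths that could be removed while keeping everything under a preferred directory. The function name list_removable and the directory names are hypothetical, and it assumes no path contains a newline.

```shell
#!/bin/sh
# Hypothetical filter: reads fdupes/jdupes-style output on stdin and prints
# the paths that do NOT start with the preferred prefix, but only for sets
# that contain at least one file under that prefix. Sets without a preferred
# copy are left untouched.
list_removable() {
    KEEP_PREFIX=$1 awk '
        BEGIN { keep = ENVIRON["KEEP_PREFIX"] }
        function emit_set(    i) {
            if (has_keep)
                for (i = 1; i <= n; i++)
                    if (index(paths[i], keep) != 1) print paths[i]
            n = 0; has_keep = 0
        }
        /^$/ { emit_set(); next }      # an empty line ends the current set
        {
            paths[++n] = $0
            if (index($0, keep) == 1) has_keep = 1   # path starts with prefix
        }
        END { emit_set() }             # the last set may lack a trailing empty line
    '
}

# Example: review the list before deleting anything.
# fdupes -r master/ incoming/ | list_removable master/
```

The prefix has to match the way the tool prints paths (for example ./master/ if the scan was started from "."). Because the filter works set by set, duplicates that exist only outside the preferred directory are never listed.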

@adrianlopezroche
Owner

adrianlopezroche commented Nov 17, 2022

I came looking for a way to do this thing, too, incidentally. I'd love a way to --prefer /some/arbitrary/master/path in one of these tools.

I suppose it's back to setting the "master" as read-only and running fdupes to see if it blows up.

If you don't need it to run unsupervised (via -N) then you can use the new fdupes interactive mode to do this:

selb /some/arbitrary/master/path
isel
ds
prune

The first command will select every file in your "master" path, the second will deselect those and instead select their duplicates, the third will mark the now selected ones for deletion, and the last one will delete them.

@101Dude

101Dude commented Oct 23, 2023

It's been a while. Somewhere I saw this recommended for choosing which to delete:

fdupes -r dir1 dir2|grep dir1/|xargs rm

I can't get that to work on macOS, and I am sure someone here can suggest why. This is an alternative method of getting what you want.

@macau23
Author

macau23 commented Nov 28, 2023

@101Dude you should not use rm with xargs; it will do the wrong thing with spaces or file names that need quoting.

@101Dude

101Dude commented Nov 29, 2023

@macau23
this is what I ended up using and it works well.

xargs runs into issues when path names have special characters.
fdupes doesn't have a -print0 option like find does - it trips up.

The following command results in an error because of a single quote in a filename:

fdupes -r dir1 dir2|grep dir1/|xargs rm

xargs: unterminated quote

The UNIX way around this is to add another command between the grep and xargs commands:

... | tr '\n' '\0' | xargs -0 -n1 ...

This addition comes from an excellent explanation in the answers to the question "Make xargs execute the command once for each line of input".

The full command would then be:

fdupes -r dir1 dir2 |grep "dir2/" |tr '\n' '\0' |xargs -0 -n1 rm -v

Check this command first using echo or another non-destructive command before using rm.
Adding the -v option allows you to see what has been removed.

An example of a non-destructive option is to use the tag command (install with homebrew).
Add a red Finder tag to files that are duplicates so you can manually select and drag to trash :)

fdupes -r dir1 dir2 | grep "dir2" | tr '\n' '\0' | xargs -0 -n1 -I % tag -a red %
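
An alternative to the tr/xargs step, sketched here under the assumption that no file name contains a newline, is a plain while/read loop, which treats each line as one path regardless of spaces or quotes. The function name remove_listed and its --really switch are invented for this example; by default it only prints what it would remove.

```shell
#!/bin/sh
# Hypothetical helper: reads one path per line on stdin.
# Without arguments it is a dry run; pass --really to actually delete.
remove_listed() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue      # skip the empty lines between sets
        if [ "${1:-}" = "--really" ]; then
            rm -v -- "$path"
        else
            printf 'would remove: %s\n' "$path"
        fi
    done
}

# Dry run first, then repeat with --really once the list looks right:
# fdupes -r dir1 dir2 | grep "dir2/" | remove_listed
# fdupes -r dir1 dir2 | grep "dir2/" | remove_listed --really
```

Note that the grep step still matches "dir2/" anywhere in a path, and it selects every dir2 file that appears in any duplicate set, including sets that have no copy in dir1.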

@sylvainsab

sylvainsab commented Mar 4, 2024

same request

@VD171

VD171 commented Mar 25, 2024

Any solution?

@skitchin

skitchin commented May 27, 2024

Here's how you can delete duplicates without removing files from a specific directory: use the -o name option and add extra leading slashes to the full paths to set the priority order. For example:

fdupes -rdN -o name //pictures/photo1 /pictures/photo2

This command will delete duplicates found in the photo2 directory, keeping the files in photo1.

If you have three or more directories, add slashes in the order of priority. For instance, with four directories:

fdupes -rdN -o name ////pictures/photo1 ///pictures/photo2 //pictures/photo3 /pictures/photo4

This setup ensures that any duplicates found in photo1 and photo2 will be deleted from photo2. Similarly, duplicates found in photo2 and photo4 will be deleted from photo4.

I hope this helps.
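
Why the extra slashes affect the order can be sketched without fdupes at all, assuming that -o name compares the full path strings byte by byte (an assumption about fdupes internals, not something confirmed in this thread): "/" sorts before letters, so a path with more leading slashes comes first. The function name demo_order and the file names are made up.

```shell
#!/bin/sh
# Illustration only: a bytewise sort of the path strings, standing in for
# the ordering that -o name is ASSUMED to apply inside each duplicate set.
demo_order() {
    printf '%s\n' \
        '/pictures/photo4/img.jpg' \
        '///pictures/photo2/img.jpg' \
        '////pictures/photo1/img.jpg' \
        '//pictures/photo3/img.jpg' |
        LC_ALL=C sort
}

demo_order
```

The photo1 path sorts first, followed by photo2, photo3 and photo4. Keep in mind that -N preserves only the first path in each set, so a second copy inside photo1 itself would still be removed.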

@jsalatiel

Clever!
