📗 Update documentation for 'fdup' and 'rdup'
evrignaud committed Oct 10, 2016
1 parent ae4f2d6 commit 9ca6bfb
Showing 11 changed files with 109 additions and 40 deletions.
4 changes: 2 additions & 2 deletions samples/simple-example.sh
@@ -91,7 +91,7 @@ set -x
fim st || exit $?

echo
echo \# Search for duplicated files
echo \# Search for duplicate files
fim fdup || exit $?

echo
@@ -103,7 +103,7 @@ set -x
fim st || exit $?

echo
echo \# No duplicated files as we are looking only inside the dir01
echo \# No duplicate files as we are looking only inside the dir01
fim fdup || exit $?

echo
71 changes: 62 additions & 9 deletions src/main/asciidoc/docs/en/dealing-with-duplicates.adoc
@@ -1,18 +1,71 @@
= Dealing with duplicates

Duplicated files are addressed by Fim in two different ways.
Duplicate files are addressed by Fim in two different ways.

== Duplicates inside a Fim repository

Fim allow you to detect duplicates using the `fdup` command. It displays the list of duplicated files. +
See it in action in <<simple-example.adoc#_search_for_duplicated_files,Search for duplicated files>>.
Fim allow you to detect duplicates using the `fdup` command.

If you want to remove them, Fim won't do it. It does not provide a smart way to remove duplicates inside the same repository.
You can also remove them.

=== Find duplicates

Fim is able to display duplicates contained in a repository using the `fdup` (`find-duplicates`) command.
It displays the list of duplicate files. +
See it in action in <<simple-example.adoc#_search_for_duplicate_files,Search for duplicate files>>.

[source, bash]
----
$ fim fdup
----

If the current State is already commited, you can skip the workspace scanning phase with the `-l` option :

[source, bash]
----
$ fim fdup -l
----

=== Remove duplicates

You can remove duplicate files.

* Either interactive:

[source, bash]
----
$ fim rdup
----

* Or automatically preserving the first file in the list:

[source, bash]
----
$ fim rdup -y
----

In both cases, it is possible to use the current State as with `fdup` by adding the `-l` option:

[source, bash]
----
$ fim rdup -l
----

== Duplicates that are outside

You can use Fim to remove duplicated files that are located outside a Fim repository using the `rdup` command.
It can be useful if you want to cleanup old backups that are no more synchronized and you want to be sure to not lose any files that could have been modified or added.
Fim can delete duplicate files contained in another repository. +
It can be useful if you want to cleanup old backups that are no more synchronized and you want to be sure to not lose any files that could have been modified or added. +
It erases all files locally that already exist in the master workspace.

For example, `backup` is a copy of the repository named `source` :

[source, bash]
----
$ cd backup
$ fim rdup -m ../source
----

When the workspace to clean is remote, you can just copy the `.fim` in an empty directory and set it as parameter to the `-m` option of the `rdup` command

=== Simple duplicates removing

@@ -146,7 +199,7 @@ Do you really want to commit (y/n/A)? y
------
~/rdup-example/source$ cd ../backup/
~/rdup-example/backup$ fim rdup -m ../source
2016/05/21 08:39:14 - Info - Searching for duplicated files using the ../source directory as master
2016/05/21 08:39:14 - Info - Searching for duplicate files using the ../source directory as master
2016/05/21 08:39:14 - Info - Scanning recursively local files, using 'full' mode and 4 threads
(Hash progress legend for files grouped 10 by 10: # > 1 GB, @ > 200 MB, O > 100 MB, 8 > 50 MB, o > 20 MB, . otherwise)
@@ -173,13 +226,13 @@ Do you really want to remove it (y/n/A)? A
'file10' is a duplicate of '../source/file10'
'file10' removed
8 duplicated files found. 8 duplicated files removed
8 duplicate files found. 8 duplicate files removed
------

[IMPORTANT]
=====
When you are prompted with a question asking for (y/n/A) which means Yes, No, or All Yes. +
All Yes will reply Yes to all the remaining questions. You can see it in action above.
'All Yes' will reply Yes to all the remaining questions. You can see it in action above.
=====

==== Only the two modified files remains
2 changes: 1 addition & 1 deletion src/main/asciidoc/docs/en/faq.adoc
@@ -9,7 +9,7 @@ Doing this allow you to:

- Quickly find the modifications done in this specific sub-directory. You will hash only the files contained inside and not the complete file tree
- Quickly commit the modifications done in this sub-directory
- Quickly find the duplicated files contained in this sub-directory
- Quickly find the duplicate files contained in this sub-directory and remove them
- Quickly reset the attributes of files contained in this sub-directory

All the other commands will run as if you were on the top of the Fim repository.
9 changes: 5 additions & 4 deletions src/main/asciidoc/docs/en/fim-usage.adoc
@@ -17,8 +17,9 @@ Available commands:
rfa / reset-file-attrs Reset the files attributes like they were stored in the last committed State
dcor / detect-corruption Find changes most likely caused by a hardware corruption or a filesystem bug.
Change in content, but not in creation time and last modified time
fdup / find-duplicates Find local duplicated files in the Fim repository
rdup / remove-duplicates Remove duplicated files from local directory based on a remote master Fim repository
fdup / find-duplicates Find local duplicate files in the Fim repository
rdup / remove-duplicates Remove duplicates found by the 'fdup' command.
If you specify the '-m' option it removes duplicates based on a master repository
log Display the history of the States with the same output as the 'status' command
dign / display-ignored Display the files or directories that are ignored into the last State
rbk / rollback Rollback the last commit. It will remove the last State
@@ -41,9 +42,9 @@ Available options:
You can specify multiple kind of difference to ignore separated by a comma.
For example: -i attrs,dates,renamed
-l,--use-last-state Use the last committed State.
Only for the find local duplicated files command
Both for the 'find-duplicates' and 'remove-duplicates' commands
-m,--master-fim-repository <arg> Fim repository directory that you want to use as remote master.
Only for the remove duplicated files command
Only for the 'remove-duplicates' command
-n,--do-not-hash Do not hash file content. Uses only file names and modification dates
-o,--output-max-lines <arg> Change the maximum number lines displayed for the same kind of modification.
Default value is 200 lines
22 changes: 18 additions & 4 deletions src/main/asciidoc/docs/en/most-common-use-cases.adoc
@@ -2,7 +2,14 @@

Fim can be used for different kind of use cases.

== Binary Workspace management
== Managing a workspace

* Manage directories filled with binary. For example: pictures, music or movies

* Know the status of a workspace in which we work episodically

* Track changes over time

Personally I use Fim to manage my photos and videos.
When I have new photos, I put them at the right place in my pictures folder and then I do `fim&nbsp;ci` from the sub-directory
containing the new photos to record a new State, as I could do with Git.
@@ -13,9 +20,16 @@ More details on using Fim from a sub-directory can be found in <<faq.adoc#_run_f

The `fim status` command let me know when I want (even super quickly) if something changed in my pictures folder.

== Duplicates removal
I can easily identify and delete the photos I have duplicated on my drive or another computer with the command `fim rdup`. +
For this, I just need to copy the `.fim` directory on the other computer. +
== Duplicates detection and removal

Fim detects duplicate files and distinguishes two cases:

* Duplicates inside a Fim repository: +
Fim can detect and remove them

* Duplicates that are outside: +
Useful to cleanup desynchronized old backups

More details in <<dealing-with-duplicates.adoc#_dealing_with_duplicates,Dealing with duplicates>>.

== Backup integrity
12 changes: 6 additions & 6 deletions src/main/asciidoc/docs/en/simple-example.adoc
@@ -174,12 +174,12 @@ Deleted: file06
1 added, 1 copied, 3 duplicated, 1 date modified, 2 content modified, 1 renamed, 1 deleted
----

=== Search for duplicated files
=== Search for duplicate files

[source, bash]
----
simple-example$ fim fdup
2016/05/09 21:58:37 - Info - Searching for duplicated files
2016/05/09 21:58:37 - Info - Searching for duplicate files
2016/05/09 21:58:37 - Info - Scanning recursively local files, using 'full' mode and 2 threads
(Hash progress legend for files grouped 10 by 10: # > 1 GB, @ > 200 MB, O > 100 MB, 8 > 50 MB, o > 20 MB, . otherwise)
@@ -195,7 +195,7 @@ simple-example$ fim fdup
file07
file07.dup1
3 duplicated files spread into 2 duplicate sets, 36 bytes of total wasted space
3 duplicate files spread into 2 duplicate sets, 36 bytes of total wasted space
----

=== From the `dir01` sub-directory
@@ -225,18 +225,18 @@ Added: dir01/file01
1 added
----

There are no duplicated file as we are looking only inside `dir01`.
There are no duplicate file as we are looking only inside `dir01`.

[source, bash]
----
simple-example/dir01$ fim fdup
2016/05/09 21:58:37 - Info - Searching for duplicated files
2016/05/09 21:58:37 - Info - Searching for duplicate files
2016/05/09 21:58:37 - Info - Scanning recursively local files, using 'full' mode and 2 threads
(Hash progress legend for files grouped 10 by 10: # > 1 GB, @ > 200 MB, O > 100 MB, 8 > 50 MB, o > 20 MB, . otherwise)
2016/05/09 21:58:38 - Info - Scanned 1 file (12 bytes), hashed 12 bytes (avg 12 bytes/s), during 00:00:00
No duplicated file found
No duplicate file found
----

Commit only the local modifications done inside this directory.
4 changes: 2 additions & 2 deletions src/main/java/org/fim/Fim.java
@@ -107,7 +107,7 @@ private Options buildOptions() {
opts.addOption(buildOption("d", "directory", "Run Fim into the specified directory").hasArg().build());
opts.addOption(buildOption("e", "errors", "Display execution error details").build());
opts.addOption(buildOption("m", "master-fim-repository", "Fim repository directory that you want to use as remote master.\n" +
"Only for the remove duplicated files command").hasArg().build());
"Only for the 'remove-duplicates' command").hasArg().build());
opts.addOption(buildOption("n", "do-not-hash", "Do not hash file content. Uses only file names and modification dates").build());
opts.addOption(buildOption("s", "super-fast-mode", "Use super-fast mode. Hash only 3 small blocks.\n" +
"One at the beginning, one in the middle and one at the end").build());
@@ -122,7 +122,7 @@ private Options buildOptions() {
"You can specify multiple kind of difference to ignore separated by a comma.\n" +
"For example: -i attrs,dates,renamed").hasArg().valueSeparator(',').build());
opts.addOption(buildOption("l", "use-last-state", "Use the last committed State.\n" +
"Only for the find local duplicated files command").build());
"Both for the 'find-duplicates' and 'remove-duplicates' commands").build());
opts.addOption(buildOption("c", "comment", "Comment to set during init and commit").hasArg().build());
opts.addOption(buildOption("o", "output-max-lines", "Change the maximum number lines displayed for the same kind of modification.\n" +
"Default value is 200 lines").hasArg().build());
4 changes: 2 additions & 2 deletions src/main/java/org/fim/command/FindDuplicatesCommand.java
@@ -39,7 +39,7 @@ public String getShortCmdName() {

@Override
public String getDescription() {
return "Find local duplicated files in the Fim repository";
return "Find local duplicate files in the Fim repository";
}

@Override
@@ -48,7 +48,7 @@ public Object execute(Context context) throws Exception {

fileContentHashingMandatory(context);

Logger.info(String.format("Searching for duplicated files%s", context.isUseLastState() ? " from the last committed State" : ""));
Logger.info(String.format("Searching for duplicate files%s", context.isUseLastState() ? " from the last committed State" : ""));
Logger.newLine();

State state;
11 changes: 6 additions & 5 deletions src/main/java/org/fim/command/RemoveDuplicatesCommand.java
@@ -62,7 +62,8 @@ public String getShortCmdName() {

@Override
public String getDescription() {
return "Remove duplicated files from local directory based on a remote master Fim repository";
return "Remove duplicates found by the 'fdup' command.\n" +
" If you specify the '-m' option it removes duplicates based on a master repository";
}

@Override
@@ -112,7 +113,7 @@ public Object execute(Context context) throws Exception {
}
context.setRepositoryRootDir(masterFimRepository);

Logger.info(String.format("Searching for duplicated files using the %s directory as master", context.getMasterFimRepositoryDir()));
Logger.info(String.format("Searching for duplicate files using the %s directory as master", context.getMasterFimRepositoryDir()));
Logger.newLine();

State masterState = new StateManager(context).loadLastState();
@@ -142,13 +143,13 @@ public Object execute(Context context) throws Exception {

if (totalFilesRemoved == 0) {
if (duplicatedFilesCount == 0) {
Logger.out.println("No duplicated file found");
Logger.out.println("No duplicate file found");
} else {
Logger.out.printf("Found %d duplicated %s. No files removed%n", duplicatedFilesCount, pluralForLong("file", duplicatedFilesCount));
Logger.out.printf("Found %d duplicate %s. No files removed%n", duplicatedFilesCount, pluralForLong("file", duplicatedFilesCount));
}
} else {
Logger.newLine();
Logger.out.printf("%d duplicated %s found. %d duplicated %s removed%n",
Logger.out.printf("%d duplicate %s found. %d duplicate %s removed%n",
duplicatedFilesCount, pluralForLong("file", duplicatedFilesCount),
totalFilesRemoved, pluralForLong("file", totalFilesRemoved));
}
2 changes: 1 addition & 1 deletion src/main/java/org/fim/internal/StateComparator.java
@@ -282,7 +282,7 @@ private void searchForDifferences() {
}
List<FileState> removed = notFoundInCurrentFileStateList.removeAll(originalFileHash);
if (removed != null && removed.size() > 0) {
// Used to check other duplicated files that have been renamed
// Used to check other duplicate files that have been renamed
foundInPreviousState.put(originalFileHash, originalFileState);
}
} else {
8 changes: 4 additions & 4 deletions src/main/java/org/fim/model/DuplicateResult.java
@@ -75,20 +75,20 @@ public DuplicateResult displayAndRemoveDuplicates() {

if (filesRemoved == 0) {
if (duplicatedFilesCount > 0) {
Logger.out.printf("%d duplicated %s, %s of total wasted space%n",
Logger.out.printf("%d duplicate %s, %s of total wasted space%n",
duplicatedFilesCount, pluralForLong("file", duplicatedFilesCount), byteCountToDisplaySize(totalWastedSpace));
} else {
Logger.out.println("No duplicated file found");
Logger.out.println("No duplicate file found");
}
} else {
Logger.out.printf("Removed %d files and freed %s%n", filesRemoved, byteCountToDisplaySize(spaceFreed));
long remainingDuplicates = duplicatedFilesCount - filesRemoved;
long remainingWastedSpace = totalWastedSpace - spaceFreed;
if (remainingDuplicates > 0) {
Logger.out.printf("Still have %d duplicated %s, %s of total wasted space%n",
Logger.out.printf("Still have %d duplicate %s, %s of total wasted space%n",
remainingDuplicates, pluralForLong("file", remainingDuplicates), byteCountToDisplaySize(remainingWastedSpace));
} else {
Logger.out.println("No duplicated file remains");
Logger.out.println("No duplicate file remains");
}
}
return this;