question about deletion and counts

Jeff Templon



simeto:~> notmuch search --output=files tag:deleted | wc -l
     666
simeto:~> notmuch search --format=text0 --output=files tag:deleted | xargs -0 rm

afterwards, from notmuch new:

No new mail. Removed 577 messages. Detected 89 file renames.

577 + 89 = 666 ... my guess is that there were 577 messages and 89 files
that represented duplicates of messages.  But I didn't rename the files,
I deleted them.  Should I worry?  Why is the message inaccurate?

JT

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
Carl Worth

Re: question about deletion and counts

On Sat, Nov 03 2018, Jeff Templon wrote:
> No new mail. Removed 577 messages. Detected 89 file renames.
>
> 577 + 89 = 666 ... my guess is that there were 577 messages and 89 files
> that represented duplicates of messages.

Yes. Among the messages you had tagged as deleted you had 577 unique
message IDs. In addition, you had another 89 files with message IDs that
were duplicates of one of the 577.

> But I didn't rename the files, I deleted them.  Should I worry?

Nope. Nothing to worry about here.

> Why is the message inaccurate?

Because notmuch has a more narrow view of what a "rename" is than you
do.

A file rename is a high-level operation that will be seen by notmuch as
multiple operations seen over the course of a single run of notmuch
new:

  1. A new file is added with a message ID that already exists in the
     database

  2. A file is removed with a message ID for which there are multiple
     files in the database

But notmuch doesn't detect whether both of these operations are seen in
a single pass in order to detect a rename. Instead, what it is doing is
counting every occurrence of (2) above as a rename. Here's what the code
looks like (notmuch-new.c:remove_filename):

    status = notmuch_database_remove_message (notmuch, path);
    if (status == NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID) {
        add_files_state->renamed_messages++;
        if (add_files_state->synchronize_flags == true)
            notmuch_message_maildir_flags_to_tags (message);
        status = NOTMUCH_STATUS_SUCCESS;
    } else if (status == NOTMUCH_STATUS_SUCCESS) {
        add_files_state->removed_messages++;
    }

So, whenever a filename is removed, it gets counted either as a rename
(if there is still at least one other filename in the database with the
same message ID) or as a removal (if this was the last filename for
that message ID).
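
To make the arithmetic concrete, here is a minimal sketch of that counting
rule. It is not notmuch code: struct tally, remove_all_files and the
files_per_id array are hypothetical names made up for this illustration, and
it only models "how many files share each message ID", not the real database.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tally, mirroring the two counters in the snippet above. */
struct tally {
    int removed_messages;   /* the last file for a message ID went away */
    int renamed_messages;   /* a file went away, but another copy remains */
};

/*
 * files_per_id[i] is the number of files that share the i-th message ID.
 * Delete every file, one at a time, and classify each deletion using the
 * rule described above.
 */
static struct tally
remove_all_files (const int *files_per_id, size_t n_ids)
{
    struct tally t = { 0, 0 };

    assert (files_per_id != NULL || n_ids == 0);

    for (size_t i = 0; i < n_ids; i++) {
        int remaining = files_per_id[i];

        while (remaining > 0) {
            remaining--;
            if (remaining > 0)
                t.renamed_messages++;   /* duplicates still present */
            else
                t.removed_messages++;   /* that was the last copy */
        }
    }
    return t;
}
```

Under this model, deleting 666 files that cover 577 message IDs always yields
577 removals and 666 - 577 = 89 "renames", however the duplicates happen to
be spread across the messages.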

I suppose you could come up with some other name for what it is
counting, such as "removals of duplicate messages" instead of "rename",
but that's what's happening.

I hope that helps.

-Carl

Jeff Templon

Re: question about deletion and counts

Hi Carl,

Thanks for your answer.

Carl Worth <[hidden email]> writes:

> A file rename is a high-level operation that will be seen by notmuch as
> multiple operations seen over the course of a single run of notmuch
> new:
>
>   1. A new file is added with a message ID that already exists in the
>      database
>
>   2. A file is removed with a message ID for which there are multiple
>      files in the database
>
> But notmuch doesn't detect whether both of these operations are seen in
> a single pass in order to detect a rename. Instead, what it is doing is
> counting every occurrence of (2) above as a rename. Here's what the code
> looks like (notmuch-new.c:remove_filename):
>
>     status = notmuch_database_remove_message (notmuch, path);
>     if (status == NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID) {
>         add_files_state->renamed_messages++;
>         if (add_files_state->synchronize_flags == true)
>             notmuch_message_maildir_flags_to_tags (message);
>         status = NOTMUCH_STATUS_SUCCESS;
>     } else if (status == NOTMUCH_STATUS_SUCCESS) {
>         add_files_state->removed_messages++;
>     }


Perfect explanation, thanks.

> I suppose you could come up with some other name for what it is
> counting, such as "removals of duplicate messages" instead of "rename",
> but that's what's happening.

Yes, that'd be my suggestion :-) "Name is misleading" is one of my
personal buttons that sometimes gets pushed.  If you seriously consider
it, I'd suggest "file reassignments" instead of "file renames".  A file
rename to me is

       mv jeff.txt carl.txt

The file was named jeff.txt but was renamed to carl.txt.  In the case you
describe, a file with a certain name is either assigned to a message ID
or de-assigned from that message ID; the actual file name is not changed,
as I understand it.

Anyway thanks for the explanation!  Good that I don't need to worry.

BTW I've got integration between org and notmuch up and running now, I'm
really liking this capability.

JT