proposing "notmuch purge"

Daniel Kahn Gillmor

proposing "notmuch purge"

This e-mail proposes a new notmuch subcommand "purge", which actually
removes explicitly deleted messages from the mailstore.

Notmuch currently never deletes mail, but notmuch-emacs makes it easy to
tag mail with "deleted" (via the "d" key), and "notmuch setup"
automatically adds "deleted" to the search.exclude_tags setting.

Users typically do actually want to delete messages, and they want them
gone from their filesystem and from the index.

while everyone who has used notmuch for a while probably has a clever
way of doing this, those techniques are all probably slightly different
(and possibly buggy), and the cognitive burden of figuring out how to do
this sensibly for new users seems like something we should avoid.

So i'm proposing "notmuch purge", which could be something as simple as
the equivalent of:

   notmuch search --output=files --format=text0 tag:deleted | \
      xargs --null --no-run-if-empty rm && \
         notmuch new --no-hooks

(credit for the pipeline above goes to anarcat, in Cc; i added the
"notmuch new --no-hooks" part, because i would want the items gone from
the db as well)

If i was to implement this, i'd probably implement it directly in C, not
as a shell script, because this lets us drop messages from the db as we
unlink() the files.

Inevitably, someone will come up with some more clever
options/generalizations (i can already think of at least one), but if we
have a particular implementation to hang these proposals on, it should
help us to build something sensibly robust with a wider consensus, and
new users can pick up and use that functionality easily/safely/with
confidence.

I note that this is a divergence from the historical expectation of
having all "notmuch" subcommands not directly tamper with the
mailstore.  I think given the context that divergence is OK.

Any objections to this approach?

    --dkg

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch

Antoine Beaupré

Re: proposing "notmuch purge"

On 2020-01-13 17:28:38, Daniel Kahn Gillmor wrote:
> This e-mail proposes a new notmuch subcommand "purge", which actually
> removes explicitly deleted messages from the mailstore.
>
> Notmuch currently never deletes mail, but notmuch-emacs makes it easy to
> tag mail with "deleted" (via the "d" key), and "notmuch setup"
> automatically adds "deleted" to the search.exclude_tags setting.
>
> Users typically do actually want to delete messages, and they want them
> gone from their filesystem and from the index.

I certainly do! :)

> while everyone who has used notmuch for a while probably has a clever
> way of doing this, those techniques are all probably slightly different
> (and possibly buggy), and the cognitive burden of figuring out how to do
> this sensibly for new users seems like something we should avoid.

Agreed.

> So i'm proposing "notmuch purge", which could be something as simple as
> the equivalent of:
>
>    notmuch search --output=files --format=text0 tag:deleted | \
>       xargs --null --no-run-if-empty rm && \
>          notmuch new --no-hooks
>
> (credit for the pipeline above goes to anarcat, in Cc; i added the
> "notmuch new --no-hooks" part, because i would want the items gone from
> the db as well)

I don't quite understand that last bit. I deliberately do *not* run
notmuch-new in my notmuch-purge script:

https://gitlab.com/anarcat/scripts/blob/master/notmuch-purge

... because it's set up as a pre-new hook, so it runs right before
new. So it doesn't need to call new.

> If i was to implement this, i'd probably implement it directly in C, not
> as a shell script, because this lets us drop messages from the db as we
> unlink() the files.

I also agree it might be faster than forking like crazy and rescanning
the entire DB.

But maybe we can just start with a shell wrapper for now. That's how
many git subcommands start, by the way, and it might just be "good
enough" for most people.

> Inevitably, someone will come up with some more clever
> options/generalizations (i can already think of at least one), but if we
> have a particular implementation to hang these proposals on, it should
> help us to build something sensibly robust with a wider consensus, and
> new users can pick up and use that functionality easily/safely/with
> confidence.
>
> I note that this is a divergence from the historical expectation of
> having all "notmuch" subcommands not directly tamper with the
> mailstore.  I think given the context that divergence is OK.

Well, we're already tampering with the mailstore: we're changing flags!
:)

> Any objections to this approach?

Not from me, I've been advocating for data destruction for years
now. Happy to get one more on my crew! ;)

A.

--
People arbitrarily, or as a matter of taste, assigning numerical values
to non-numerical things. And then they pretend that they haven't just
made the numbers up, which they have. Economics is like astrology in
that sense, except that economics serves to justify the current power
structure, and so it has a lot of fervent believers among the powerful.
                        - Kim Stanley Robinson, Red Mars
_______________________________________________
Teemu Likonen

Re: proposing "notmuch purge"

In reply to this post by Daniel Kahn Gillmor
Daniel Kahn Gillmor [2020-01-13T17:28:38-05] wrote:

> So i'm proposing "notmuch purge", which could be something as simple as
> the equivalent of:
>
>    notmuch search --output=files --format=text0 tag:deleted | \
>       xargs --null --no-run-if-empty rm && \
>          notmuch new --no-hooks
>
> (credit for the pipeline above goes to anarcat, in Cc; i added the
> "notmuch new --no-hooks" part, because i would want the items gone from
> the db as well)
I agree with the proposal but I would like to add one important point to
the discussion and semantics. If the implementation goes through
"notmuch search" we should understand what search.exclude_tags does.

Let's say a user has this setting: "search.exclude_tags=deleted;spam".
Then "notmuch search tag:deleted" will not find messages which have both
of the excluded tags, "deleted" and "spam". We would need "notmuch
search --exclude=false tag:deleted" to really find all messages with
tag:deleted. So here's the search semantics I propose:

    notmuch search --exclude=false --output=files \
        --format=text0 SEARCH-TERMS

I think that the "SEARCH-TERMS" part should be configurable, not
hard-coded. A user could have a setting like
"search.purge_tags=deleted;spam" and that would lead to search terms
"tag:deleted OR tag:spam" in the purge operation.
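
A minimal sketch of that expansion, assuming the tag list arrives as one
semicolon-separated string such as "deleted;spam". The function name
tags_to_query is made up for illustration, and "search.purge_tags" is only
the setting proposed above, not something notmuch defines. Tags containing
spaces or quotes are not handled here.

```shell
# Turn "deleted;spam" into "tag:deleted OR tag:spam".
# The body runs in a subshell, so the IFS and set -f changes stay local.
tags_to_query() (
    set -f          # no pathname expansion while splitting on ';'
    IFS=';'
    query=""
    for tag in $1; do
        [ -n "$tag" ] || continue       # skip empty fields such as ";;"
        if [ -z "$query" ]; then
            query="tag:$tag"
        else
            query="$query OR tag:$tag"
        fi
    done
    printf '%s\n' "$query"
)
```

The result could then be passed as SEARCH-TERMS to the "notmuch search
--exclude=false --output=files --format=text0" command shown above.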

--
///  OpenPGP key: 4E1055DC84E9DFF613D78557719D69D324539450
//  https://keys.openpgp.org/search?q=tlikonen@...
/  https://keybase.io/tlikonen  https://github.com/tlikonen

_______________________________________________
Teemu Likonen

Re: proposing "notmuch purge"

Teemu Likonen [2020-01-14T07:01:08+02] wrote:

>     notmuch search --exclude=false --output=files \
>         --format=text0 SEARCH-TERMS
>
> I think that the "SEARCH-TERMS" part should be configurable, not
> hard-coded.

Obviously there is no need for configuration if purging is just a
command that the user runs manually or in their own scripts: "notmuch purge
SEARCH-TERMS". Configuration is needed if some (mail client) operation
does purging automatically.

--
///  OpenPGP key: 4E1055DC84E9DFF613D78557719D69D324539450
//  https://keys.openpgp.org/search?q=tlikonen@...
/  https://keybase.io/tlikonen  https://github.com/tlikonen

_______________________________________________
Daniel Kahn Gillmor

Re: proposing "notmuch purge"

In reply to this post by Teemu Likonen
On Tue 2020-01-14 07:01:08 +0200, Teemu Likonen wrote:
> We would need "notmuch search --exclude=false tag:deleted" to really
> find all messages with tag:deleted.

I agree that we ought to deliberately avoid the exclude_tags when
purging.

> I think that the "SEARCH-TERMS" part should be configurable, not
> hard-coded. A user could have a setting like
> "search.purge_tags=deleted;spam" and that would lead to search terms
> "tag:deleted OR tag:spam" in the purge operation.

I want the user to be able to run "notmuch purge", with no arguments, to
"Do What I Mean"™

I also want the "purge" subcommand to have its own configuration
space -- it's *not* a specialized form of "search".

So if we choose to make it configurable, i have two (mutually-exclusive)
counter-proposals to yours above.  In either case, if the user supplies
search terms, they are used instead of pulling from the config.

a) the config variable is "purge.tags", and by default (if no setting is
   present) its value is "deleted".  "notmuch purge" with no search
   terms expands this value into a tags-based selection.

b) "notmuch purge" with no search terms looks for a special stored query
   named "query.purge".  If that is not present, it uses "tag:deleted".

(b) is slightly more flexible than (a), in that the user can configure
it to use arbitrary queries, not just tags, though both behave
identically in a default (not-explicitly-configured) setup.
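
A rough sketch of the lookup order in option (b), assuming the stored query
can be read with "notmuch config get query.purge" and that an unset value
comes back empty. The function name purge_terms is hypothetical; nothing
like it exists in notmuch today.

```shell
# Decide which search terms a purge run would use.
# Precedence: explicit arguments, then the stored query "query.purge",
# then the built-in default "tag:deleted".
purge_terms() {
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$*"
        return 0
    fi
    # An unset or unreadable setting leaves $stored empty.
    stored=$(notmuch config get query.purge 2>/dev/null)
    if [ -n "$stored" ]; then
        printf '%s\n' "$stored"
    else
        printf '%s\n' 'tag:deleted'
    fi
}
```

Option (a) would differ only in the middle step: read a tag list from
"purge.tags" and expand it into a tag-based query instead of using a stored
query verbatim.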

fwiw, i'd also like to consider exposing this functionality from
libnotmuch, not just from the cli (so that it can be used by MUAs that
are based solely on the library), but if there are good arguments for
avoiding this in the library, i'd be happy to hear them.

            --dkg


_______________________________________________
Rollins, Jameson

Re: proposing "notmuch purge"

On Tue, Jan 14 2020, Daniel Kahn Gillmor <[hidden email]> wrote:

>> I think that the "SEARCH-TERMS" part should be configurable, not
>> hard-coded. A user could have a setting like
>> "search.purge_tags=deleted;spam" and that would lead to search terms
>> "tag:deleted OR tag:spam" in the purge operation.
>
> I want the user to be able to run "notmuch purge", with no arguments, to
> "Do What I Mean"™
>
> I also want the "purge" subcommand to have its own configuration
> space -- it's *not* a specialized form of "search".

Honestly I don't see the point of any user configuration here.  Seems
likely to only add confusion and possibly improperly deleted messages,
which would be very bad.

Just use the "deleted" tag only.  It's already being used in multiple
places to mean that the message should be deleted.

jamie.
_______________________________________________
Antoine Beaupré

Re: proposing "notmuch purge"

On 2020-01-14 11:55:36, Jameson Graef Rollins wrote:

> On Tue, Jan 14 2020, Daniel Kahn Gillmor <[hidden email]> wrote:
>>> I think that the "SEARCH-TERMS" part should be configurable, not
>>> hard-coded. A user could have a setting like
>>> "search.purge_tags=deleted;spam" and that would lead to search terms
>>> "tag:deleted OR tag:spam" in the purge operation.
>>
>> I want the user to be able to run "notmuch purge", with no arguments, to
>> "Do What I Mean"™
>>
>> I also want the "purge" subcommand to have its own configuration
>> space -- it's *not* a specialized form of "search".
>
> Honestly I don't see the point of any user configuration here.  Seems
> likely to only add confusion and possibly improperly deleted messages,
> which would be very bad.
>
> Just use the "deleted" tag only.  It's already being used in multiple
> places to mean that the message should be deleted.

Agreed. If you want to delete messages matching another tag, you just
run:

    notmuch tag +deleted tag:another
    notmuch purge

Composability wins over configurability in this case. :)

A.

--
Le péché est né avant la vertu, comme le moteur avant le frein.
                         - Jean-Paul Sartre
_______________________________________________
Daniel Kahn Gillmor

Re: proposing "notmuch purge"

On Tue 2020-01-14 15:03:29 -0500, Antoine Beaupré wrote:
> Agreed. If you want to delete messages matching another tag, you just
> run:
>
>     notmuch tag +deleted tag:another
>     notmuch purge
>
> Composability wins over configurability in this case. :)

I like this outcome, though i'm not sure i like the *argument* for
composability necessarily.  If we're talking about end-user workflow,
most folks don't want to be able to compose.

At any rate, though, i'd be happy with a simpler subcommand, with no
configurability initially.

The man page is shorter too :)  And, there's less of a need to think
about providing the user with a warning if they ask to do something
really crazy like "notmuch purge '*'"

What do folks think about exposing a "purge" interface in the C and
Python APIs as well?  Should that also be similarly un-parameterized?

  --dkg

_______________________________________________
Ryan Tate

Re: proposing "notmuch purge"

In reply to this post by Daniel Kahn Gillmor
Daniel Kahn Gillmor <[hidden email]> writes:

> So i'm proposing "notmuch purge", which could be something as simple as
> the equivalent of:
>
>    notmuch search --output=files --format=text0 tag:deleted | \
>       xargs --null --no-run-if-empty rm && \
>          notmuch new --no-hooks
>
> (credit for the pipeline above goes to anarcat, in Cc; i added the
> "notmuch new --no-hooks" part, because i would want the items gone from
> the db as well)

Is there any other notmuch command that results in a change to the state
of actual mail files, as opposed to the database?

Personally, I would be surprised to learn that the command "notmuch
purge" deleted actual emails on my filesystem. I would expect any
notmuch command would only operate on the database. As far as I can tell
-- and I could be forgetting something! -- the current suite of commands
simply mutate the database, never the actual files.

What I would expect to happen is that "notmuch purge" removes mails
tagged "deleted" from the notmuch index. (And perhaps with a flag, like
say "--rmfiles", would take the step of actually deleting files.)

Of course, I like to think I'd read the manpage of a command involving
the word "purge" before executing said command :-) But I think I'd be
surprised when I did, in this case.

Just my $.02.

(Thank you to anyone on thread who has helped build notmuch, it has
helped me enormously.)

          Ryan
_______________________________________________
Brian May

Re: proposing "notmuch purge"

In reply to this post by Daniel Kahn Gillmor
Daniel Kahn Gillmor <[hidden email]> writes:

> So i'm proposing "notmuch purge", which could be something as simple as
> the equivalent of:

I can't help thinking it will only be a matter of time before somebody
mistypes the search spec and accidentally deletes all their mail...

Of course, this won't be a drama, because said user will revert to
up-to-date backup, created just before manually entering risky command,
right? ;-)
--
Brian May <[hidden email]>
https://linuxpenguins.xyz/brian/
_______________________________________________
Rollins, Jameson

Re: proposing "notmuch purge"

In reply to this post by Ryan Tate
On Tue, Jan 14 2020, Ryan Tate <[hidden email]> wrote:
> Is there any other notmuch command that results in a change to the state
> of actual mail files, as opposed to the database?

If maildir.synchronize_flags is set true then maildir flags in message
file names will be synchronized with tags (see notmuch-config(1)).

jamie.
_______________________________________________
Daniel Kahn Gillmor

Re: proposing "notmuch purge"

In reply to this post by Brian May
On Wed 2020-01-15 09:59:14 +1100, Brian May wrote:

> Daniel Kahn Gillmor <[hidden email]> writes:
>
>> So i'm proposing "notmuch purge", which could be something as simple as
>> the equivalent of:
>
> I can't help thinking it will only be a matter of time before somebody
> mistypes the search spec and accidentally deletes all their mail...
>
> Of course, this won't be a drama, because said user will revert to
> up-to-date backup, created just before manually entering risky command,
> right? ;-)
I agree that's a risk, which is why (further downthread) i am finding
myself liking the idea of omitting the search terms argument entirely,
in favor of a very rigid interface that allows only "notmuch purge" (no
arguments or options).

in id:[hidden email], i wrote:

>> [with the rigid interface], there's less of a need to think
>> about providing the user with a warning if they ask to do something
>> really crazy like "notmuch purge '*'"

would that satisfy your fears?

   --dkg

_______________________________________________
Antoine Beaupré

Re: proposing "notmuch purge"

In reply to this post by Ryan Tate
On 2020-01-14 17:48:34, Ryan Tate wrote:

> Daniel Kahn Gillmor <[hidden email]> writes:
>> So i'm proposing "notmuch purge", which could be something as simple as
>> the equivalent of:
>>
>>    notmuch search --output=files --format=text0 tag:deleted | \
>>       xargs --null --no-run-if-empty rm && \
>>          notmuch new --no-hooks
>>
>> (credit for the pipeline above goes to anarcat, in Cc; i added the
>> "notmuch new --no-hooks" part, because i would want the items gone from
>> the db as well)
>
> Is there any other notmuch command that results in a change to the state
> of actual mail files, as opposed to the database?

As jrollins said, mail flags can be changed, but that's about it. Email
contents are never modified and files are never removed, by principle,
so far.

> Personally, I would be surprised to learn that the command "notmuch
> purge" deleted actual emails on my filesystem. I would expect any
> notmuch command would only operate on the database. As far as I can tell
> -- and I could be forgetting something! -- the current suite of commands
> simply mutate the database, never the actual files.
>
> What I would expect to happen is that "notmuch purge" removes mails
> tagged "deleted" from the notmuch index. (And perhaps with a flag, like
> say "--rmfiles", would take the step of actually deleting files.)
>
> Of course, I like to think I'd read the manpage of a command involving
> the word "purge" before executing said command :-) But I think I'd be
> surprised when I did, in this case.

That's an excellent point!! If we have one user that has that
misconception and runs the command and destroys email, maybe it's worth
thinking more about ways of preventing such catastrophes! :)

Maybe some --force argument would be useful here? Or would "notmuch
delete" be more obvious to you? Or maybe a confirmation dialog unless
--yes or --force is passed?
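
One way the --force idea could look as a shell wrapper, sketched under the
assumption that "deleted" is the only tag purged. The function name
purge_deleted and its --force flag are hypothetical, and "-0 -r" are the
short forms of the xargs options used in the pipeline earlier in the thread.

```shell
# Without --force: list the files that would be removed and touch nothing.
# With --force: unlink them. The notmuch database is not updated here.
purge_deleted() {
    if [ "${1:-}" = "--force" ]; then
        notmuch search --exclude=false --output=files --format=text0 tag:deleted |
            xargs -0 -r rm -f --
    else
        notmuch search --exclude=false --output=files tag:deleted
        echo "dry run: pass --force to delete the files listed above" >&2
    fi
}
```

Running "purge_deleted" first and "purge_deleted --force" afterwards gives
the user one chance to notice a surprise. A following "notmuch new
--no-hooks" would still be needed to drop the removed files from the index.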

> (Thank you to anyone on thread who has helped build notmuch, it has
> helped me enormously.)

Hey that's a nice touch, thanks! (even though I haven't done much on
notmuch)

(pun intended)

(and probably failed)

a.
--
La destruction de la société totalitaire marchande n'est pas une affaire
d'opinion. Elle est une nécessité absolue dans un monde que l'on sait
condamné. Puisque le pouvoir est partout, c'est partout et tout le temps
qu'il faut le combattre. - Jean-François Brient, de la servitude moderne
_______________________________________________
Teemu Likonen

Re: proposing "notmuch purge"

In reply to this post by Rollins, Jameson
Jameson Graef Rollins [2020-01-14T11:55:36-08] wrote:

> Honestly I don't see the point of any user configuration here.  Seems
> likely to only add confusion and possibly improperly deleted messages,
> which would be very bad.
>
> Just use the "deleted" tag only.  It's already being used in multiple
> places to mean that the message should be deleted.

That is indeed simple. Here is just one more point of view for deleting.

I like "trashcan with expiry time": I manually mark some messages
"deleted" which makes them disappear because "deleted" messages are
excluded from searches. The messages are really deleted when they are
older than 30 days.

So I tend to prefer user's own search terms in "notmuch purge". My purge
command would be something like this:

    notmuch purge --no-db-update \
        "(" tag:deleted OR tag:spam ")" AND date:..30days

In my system that kind of operation runs automatically in pre-new hook.
Then new messages are fetched from server and the database is updated.

If "deleted" is the hard-coded tag name for an unconditional purging
operation, the trashcan functionality can be built with some other tag
like "trash".

--
///  OpenPGP key: 4E1055DC84E9DFF613D78557719D69D324539450
//  https://keys.openpgp.org/search?q=tlikonen@...
/  https://keybase.io/tlikonen  https://github.com/tlikonen

_______________________________________________
Örjan Ekeberg

Re: proposing "notmuch purge"

While I like the idea of making it easy to prune away old junk messages
from the mail store, I find it dangerously disruptive to suddenly change
the semantics of the deleted tag.  To me, the deleted tag has always
meant something like "I do not want to see this message again; unless it
reappears in a thread or I explicitly search for it".  The possibility
to undelete also means that deleting messages is not such a big deal.

What do you think about introducing a new tag, e.g. purge, and let
"notmuch purge" destructively remove the messages with this tag set?
Hopefully, nobody is using that particular tag for a different purpose.

Purging would then become a two-stage process; first tagging which
messages should be purged, before doing the actual non-reversible
removal.  This makes it simpler to check what would be purged before
actually doing it.

A dangerous but flexible way of configuration would be to have a
pre-purge-hook which could, for example, do things like:

    notmuch tag +purge "(" tag:deleted OR tag:spam ")" AND date:..30days

The downside of this is of course that hooks are not that easy to set up
and can easily backfire and possibly remove your entire mail collection.
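
The two-stage idea can be sketched with existing notmuch commands, assuming
a "purge" tag that nothing else uses. The three function names are
hypothetical, and the tagging query is the example from above rather than
a recommended default.

```shell
# Stage 1: mark candidates. Reversible with "notmuch tag -purge ...".
mark_for_purge() {
    notmuch tag +purge -- '(tag:deleted OR tag:spam) AND date:..30days'
}

# Review step: show the files that stage 2 would remove.
preview_purge() {
    notmuch search --exclude=false --output=files tag:purge
}

# Stage 2: irreversibly unlink the marked files, then reindex so the
# removed messages also disappear from the database.
run_purge() {
    notmuch search --exclude=false --output=files --format=text0 tag:purge |
        xargs -0 -r rm -f -- &&
        notmuch new --no-hooks
}
```

Only run_purge is destructive, so a mistyped query in stage 1 can still be
caught by preview_purge and undone by removing the tag.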

/Örjan