BUG: "notmuch insert" fails with "Delivery of non-mail file"

Alvaro Herrera

BUG: "notmuch insert" fails with "Delivery of non-mail file"

Hello

I've been using notmuch successfully for a couple of years now (mostly
via neomutt).  Thanks for developing it.

Not long ago I switched my mail setup to use notmuch insert via
mailfilter instead of good old procmail.  However, since then a number
of emails are reported by notmuch as "non-mail", and appear to not be
indexed.  (I use --keep, so they're still in my maildir).

In my reading of the code, the error ultimately comes from
g_mime_parser_construct_message rejecting the message.
I reported this to GMime, and they said that the problem is that notmuch
insert is using the mbox mode:
https://github.com/jstedfast/gmime/issues/58
(Sample email is attached there).

As far as I can tell, this is all coming from
_notmuch_message_file_parse() which sets the is_mbox flag when it sees
the "^From " line at the start of the file ... which kinda makes sense
in general terms, but for notmuch-insert I think that's the wrong thing
to do.  Maybe a solution is to pass a flag down from notmuch-insert.c's
add_file all the way down to _notmuch_message_file_parse telling it not
to treat the file as an mbox.
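
The heuristic described above can be sketched roughly as follows. This is an
illustration in Python, not notmuch's actual C code; looks_like_mbox is a
hypothetical name and the sample addresses are made up.

```python
def looks_like_mbox(raw: bytes) -> bool:
    """Return True if the data begins with an mbox-style 'From ' separator.

    Note the trailing space: an ordinary 'From: ' header does not match.
    """
    return raw.startswith(b"From ")


# A plain single-message file, as normally found in a maildir.
maildir_message = b"From: alice@example.org\nSubject: hi\n\nbody\n"

# The same message with an mbox separator line prepended by a delivery tool.
mbox_style = b"From alice@example.org Sat Jan 19 10:00:00 2019\n" + maildir_message

print(looks_like_mbox(maildir_message))
print(looks_like_mbox(mbox_style))
```

Because the check looks only at the first five bytes of the file, it cannot
tell a single delivered message with a leading separator from a genuine
multi-message mbox.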

I *think* that not all of the messages that fail parsing contain an
email attachment, so maybe I'll come back with further issues later on.
This is the first one I debugged.

Thanks

--
Álvaro Herrera
_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch

David Bremner

Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"

Alvaro Herrera <[hidden email]> writes:

> In my reading of the code, the error ultimately comes from
> g_mime_parser_construct_message rejecting the message.
> I reported this to GMime, and they said that the problem is that notmuch
> insert is using the mbox mode:
> https://github.com/jstedfast/gmime/issues/58
> (Sample email is attached there).

This issue (or a related one) has come up before:

     https://nmbug.notmuchmail.org/nmweb/search/postfix+mbox

Generally it seems to be caused by tools that add mbox 'From ' headers,
without actually mbox escaping the file. We haven't yet reached
consensus on a good solution (generally people just want to fix their
own mail, which is understandable). A workaround discussed in the
messages I reference above is to strip the 'From ' header before passing
to notmuch-insert. Perhaps some scholar of the RFCs can convince us that
that is "always" the right thing for notmuch insert to do.
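
A minimal sketch of that workaround, assuming the message passes through a
small filter before it reaches notmuch insert. The function names and the
sample message are hypothetical; only the leading separator line is touched.

```python
import io


def strip_mbox_from_line(raw: bytes) -> bytes:
    """Drop a leading mbox 'From ' separator line, if there is one."""
    if raw.startswith(b"From "):
        # Keep everything after the first newline; the separator is discarded.
        _, _, rest = raw.partition(b"\n")
        return rest
    return raw


def filter_stream(src, dst) -> None:
    """Copy one message from src to dst (binary streams) without the separator."""
    dst.write(strip_mbox_from_line(src.read()))


# Demonstration with in-memory streams. A real filter would pass
# sys.stdin.buffer and sys.stdout.buffer instead.
src = io.BytesIO(
    b"From sender@example.org Sat Jan 19 10:00:00 2019\n"
    b"Subject: test\n"
    b"\n"
    b"From here on the body is untouched.\n"
)
dst = io.BytesIO()
filter_stream(src, dst)
print(dst.getvalue().decode("ascii"))
```

Only the first line is examined, so a "From " line inside the body is left
alone. The output of such a filter would then be piped to notmuch insert,
which reads the message on standard input.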

> As far as I can tell, this is all coming from
> _notmuch_message_file_parse() which sets the is_mbox flag when it sees
> the "^From " line at the start of the file ... which kinda makes sense
> in general terms, but for notmuch-insert I think that's the wrong thing
> to do.  Maybe a solution is to pass a flag down from notmuch-insert.c's
> add_file all the way down to _notmuch_message_file_parse telling it not
> to treat the file as an mbox.
>

I'd be worried about letting notmuch-insert deliver messages that
notmuch-new would not be able to parse. In particular we'd like to keep
the property that a Maildir + the output of notmuch-dump should be
enough to completely recover the notmuch database.

Alvaro Herrera

Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"

Hi David, thanks for replying.

On 2019-Jan-19, David Bremner wrote:

> Alvaro Herrera <[hidden email]> writes:
>
> > In my reading of the code, the error ultimately comes from
> > g_mime_parser_construct_message rejecting the message.
> > I reported this to GMime, and they said that the problem is that notmuch
> > insert is using the mbox mode:
> > https://github.com/jstedfast/gmime/issues/58
> > (Sample email is attached there).
>
> This issue (or a related one) has come up before
>
>      https://nmbug.notmuchmail.org/nmweb/search/postfix+mbox
>
> Generally it seems to be caused by tools that add mbox 'From ' headers,
> without actually mbox escaping the file. We haven't yet reached
> consensus on a good solution (generally people just want to fix their
> own mail, which is understandable). A workaround discussed in the
> messages I reference above is to strip the 'From ' header before passing
> to notmuch-insert. Perhaps some scholar of the RFCs can convince us that
> that is "always" the right thing for notmuch insert to do.

I'm not sure I follow.  As I understand it, notmuch does not work with
mboxes, only with maildirs, so splitting emails at "From " is not
strictly necessary, since one file always equals one message.

As for RFC scholarship, I spent some time looking at
https://tools.ietf.org/html/rfc5322 to see if it defined any sort of
message separator ... but as far as I can tell, it only defines what
a valid message looks like.  It doesn't say where one message ends.

On the other hand, in my world, it's been quite a while since 'From '
was considered a useful message separator.  This stopped being true in a
pretty extensive way when git format-patch messages started being
posted as attachments.  But even before that, MUAs stopped adding the
">" at the start of a "From " line in human-written text.  Nowadays what
really governs the split is the Content-Length header, from the MIME
definitions.  Most tools do not escape lines starting with 'From '
anymore.  As far as I can tell, where a body part ends is defined by
RFC 2046, https://tools.ietf.org/html/rfc2046#section-5.1.1 which states
that the implementation must look for the "boundary delimiter line".
Stopping at a "From " line before finding the boundary delimiter line
would be a mistake, in my reading.
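
To illustrate the boundary-based reading, here is a small sketch using
Python's standard email package as a stand-in for a MIME parser; it is not
GMime and says nothing about how notmuch parses. The addresses, the boundary
string and the attached patch are made up.

```python
from email import message_from_bytes
from email.policy import default

# A two-part message whose attachment contains an unescaped "From " line,
# the way git format-patch output does.
raw = (
    b"From: alice@example.org\n"
    b"Subject: patch attached\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="XYZ"\n'
    b"\n"
    b"--XYZ\n"
    b"Content-Type: text/plain\n"
    b"\n"
    b"See attached.\n"
    b"--XYZ\n"
    b"Content-Type: text/x-patch\n"
    b"\n"
    b"From 0123456789abcdef Mon Sep 17 00:00:00 2001\n"
    b"Subject: [PATCH] example\n"
    b"--XYZ--\n"
)

msg = message_from_bytes(raw, policy=default)
parts = list(msg.iter_parts())

# Parts are delimited by the boundary lines, so the "From " line stays
# inside the second part instead of starting a new message.
for part in parts:
    print(part.get_content_type())
    print(part.get_content())
```

A parser that works in this mode never needs the "From " line to be escaped,
because only the boundary delimiter lines decide where a part ends.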

> > As far as I can tell, this is all coming from
> > _notmuch_message_file_parse() which sets the is_mbox flag when it sees
> > the "^From " line at the start of the file ... which kinda makes sense
> > in general terms, but for notmuch-insert I think that's the wrong thing
> > to do.  Maybe a solution is to pass a flag down from notmuch-insert.c's
> > add_file all the way down to _notmuch_message_file_parse telling it not
> > to treat the file as an mbox.
>
> I'd be worried about letting notmuch-insert deliver messages that
> notmuch-new would not be able to parse. In particular we'd like to keep
> the property that a Maildir + the output of notmuch-dump should be
> enough to completely recover the notmuch database.

Hmm, that's a good point -- I assume that notmuch-new should be patched
similarly so that those messages are valid there too.

So maybe the solution (given that, as I said above, Notmuch does not
appear to handle mboxes at all) is to just set the mbox flag to false
completely ...

--
Álvaro Herrera                PostgreSQL Expert, https://www.2ndQuadrant.com/

David Bremner

Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"

Alvaro Herrera <[hidden email]> writes:

> I'm not sure I follow.  As I understand it, notmuch does not work with
> mboxes, only with maildirs, so splitting emails at "From " is not
> strictly necessary, since one file always equals one message.

Checking for mboxes was added as a safety feature since people found
indexing large mboxes led to bad results (bloated index, crashing
indexer, etc...).

> On the other hand, in my world, it's been quite a while since 'From '
> was considered a useful message separator.  This stopped being true in a
> pretty extensive way when git format-patch messages started being
> posted as attachments.

Sure. Things on disk should either be mboxes, or not. If they start with
'From ', they are mboxes.  We attempted to take away support for single
message mboxes, but people complained even more about that. So
generally, if tools / users don't want to escape 'From ' after the first
line, the first line should not be 'From '.

My original question was whether notmuch-insert should strip the 'From '
line (and presumably save it as a normal header) before delivery.
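
What "strip and save as a normal header" could look like, as a rough Python
sketch rather than a proposed patch. The header name X-Mbox-From and the
function name are assumptions made for illustration; nothing in this thread
settles on a name.

```python
def from_line_to_header(raw: bytes, header_name: bytes = b"X-Mbox-From") -> bytes:
    """Turn a leading mbox 'From ' separator into an ordinary header line.

    header_name is a hypothetical choice, not an established convention.
    """
    if not raw.startswith(b"From "):
        return raw
    first, _, rest = raw.partition(b"\n")
    # Preserve the message's own line-ending style.
    eol = b"\r\n" if first.endswith(b"\r") else b"\n"
    value = first[len(b"From "):].rstrip(b"\r")
    return header_name + b": " + value + eol + rest


raw = (
    b"From sender@example.org Sat Jan 19 10:00:00 2019\n"
    b"Subject: test\n"
    b"\n"
    b"body\n"
)
out = from_line_to_header(raw)
print(out.decode("ascii"))
```

The envelope information survives as a header, and the stored file no longer
starts with 'From ', so it would not be taken for an mbox on a later scan.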

d

