notmuch and public-inbox

Felipe Contreras

notmuch and public-inbox

Hi,

My workflow with notmuch is nearly perfect; the only pain point I
have is fetching all the mail of a particular mailing list.

public-inbox seems ideal for doing this efficiently, but when
searching for information on linking notmuch to public-inbox I can't
find anything of value. In fact, I can't find a URL for a public-inbox
repository of the notmuch mailing list.

Am I missing something or has nobody really worked on linking these two
tools? Seems like an obvious area of opportunity.

Cheers.

--
Felipe Contreras
_______________________________________________
notmuch mailing list -- [hidden email]
To unsubscribe send an email to [hidden email]
Eric Wong

Re: notmuch and public-inbox

Felipe Contreras <[hidden email]> wrote:
> Hi,
>
> My workflow with notmuch is near to perfect, however, the only pain
> point I have is fetching all the mail of a particular mailing list.
>
> To do this efficiently public-inbox seems ideal, however, when
> searching information to link notmuch to public-inbox I don't find
> anything of value. In fact, I can't find an URL of a public-inbox
> repository of the notmuch mailing list.

Kyle maintains an unofficial mirror at https://yhetil.org/notmuch

There's no real relationship between them aside from the fact that
they both use Xapian (and I learned Xapian from reading the notmuch
source).

> Am I missing something or has nobody really worked on linking these two
> tools? Seems like an obvious area of opportunity.

I think W. Trevor King (Cc-ed) also started looking into something
many years ago, but I'm not sure if anything became of it.

I never had the interest in using notmuch since Maildirs are a
non-starter with millions of messages with current FSes/OSes.
mairix + gzipped mboxes mostly works for me (though mairix
indexing is silly expensive[1]).


[1] also, there's a footnote about something on the git list
    which wouldn't be appropriate to discuss here on the
    notmuch list.
_______________________________________________
Felipe Contreras

Re: notmuch and public-inbox

On Fri, Apr 30, 2021 at 7:05 PM Eric Wong <[hidden email]> wrote:

>
> Felipe Contreras <[hidden email]> wrote:
> > My workflow with notmuch is near to perfect, however, the only pain
> > point I have is fetching all the mail of a particular mailing list.
> >
> > To do this efficiently public-inbox seems ideal, however, when
> > searching information to link notmuch to public-inbox I don't find
> > anything of value. In fact, I can't find an URL of a public-inbox
> > repository of the notmuch mailing list.
>
> Kyle maintains an unofficial mirror at https://yhetil.org/notmuch

Nice. Who is Kyle?

> There's no real relationship between them aside from they both
> use Xapian (and I learned Xapian from reading the notmuch source).

I don't mean sharing the Xapian database (although that could be
interesting for the future). I'm talking about as a client of
public-inbox, not as a server.

I mean doing a git clone for a public-inbox repository and notmuch
indexing that repository.

> > Am I missing something or has nobody really worked on linking these two
> > tools? Seems like an obvious area of opportunity.
>
> I think W. Trevor King (Cc-ed) also started looking something
> many years ago, but I'm not sure if anything became of it.
>
> I never had the interest in using notmuch since Maildirs are a
> non-starter with millions of messages with current FSes/OSes.
> mairix + gzipped mboxes mostly works for me, (though mairix
> indexing is silly expensive[1])

If notmuch were patched to support the public-inbox format--as an
alternative to Maildir--then users of public-inbox could clone a
repository and use notmuch to index it.

I don't see how that could be difficult. But then again, I haven't
looked at the Maildir code.

Cheers.

--
Felipe Contreras
_______________________________________________
Eric Wong

Re: notmuch and public-inbox

In reply to this post by Eric Wong
Carl Worth <[hidden email]> wrote:

> On Sat, May 01 2021, Eric Wong wrote:
> > I never had the interest in using notmuch since Maildirs are a
> > non-starter with millions of messages with current FSes/OSes.
>
> What bottleneck are you seeing here?
>
> I don't have million(s) of messages but I'm getting close with 1.48M
> messages in my current notmuch index.
>
> I'm not seeing any problematic performance from the filesystem or OS
> myself, so I'm curious what problem you're referring to here.

I assume you have several Maildirs and not just one with 1.48M?

I never actually used notmuch myself; most of my aversion
comes from years of using Maildir sync tools (mbsync,
offlineimap, rsync).  They all struggle with the many inodes,
and with the syscalls + cache required to walk them.

It's the same reason git puts old objects in packfiles rather
than having millions of loose objects.

Furthermore, my MUA (mutt) struggles on a single Maildir when
its size goes over ~50K.  Maildir is fine as a dumping ground
for mairix search results (typically a few dozen/hundred results).

Maildir fares better nowadays on FSes with compression and
checksums; the lack of both used to be a point against it.
Syscalls, though, have also become more expensive with CPU
vulnerability mitigations.

I've always gzipped my archival mboxes for compression and CRC.
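
As an illustration of the CRC point above, here is a minimal Python
sketch (not taken from any tool mentioned in this thread) that checks
whether a gzipped mbox is still intact. The function name
gzip_is_intact and the chunk size are made up for the example; it
relies only on the standard gzip module, which verifies the stored
CRC-32 once the stream has been read to the end.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses cleanly.

    Reading through to EOF makes the gzip module compare the stored
    CRC-32 and length against the decompressed data.
    """
    try:
        with gzip.open(path, "rb") as fh:
            # Discard the data; we only care whether reading succeeds.
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad header or CRC mismatch, EOFError a
        # truncated file, zlib.error a corrupt deflate stream.
        return False
    return True
```

This only detects corruption; it cannot repair it, and it has to
decompress the whole archive to do so.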

My local mirror of all the messages on lore.kernel.org/* is over
14.6M(*) and growing...  (LKML is 4M of that).


(*) 14.6M in the new combined "extindex" format that should be on
    lore.kernel.org, soon.  For now, I have an experimental
    instance on https://yhbt.net/lore/all/
_______________________________________________
Eric Wong

Re: notmuch and public-inbox

In reply to this post by Felipe Contreras
Felipe Contreras <[hidden email]> wrote:

> On Fri, Apr 30, 2021 at 7:05 PM Eric Wong <[hidden email]> wrote:
> >
> > Felipe Contreras <[hidden email]> wrote:
> > > My workflow with notmuch is near to perfect, however, the only pain
> > > point I have is fetching all the mail of a particular mailing list.
> > >
> > > To do this efficiently public-inbox seems ideal, however, when
> > > searching information to link notmuch to public-inbox I don't find
> > > anything of value. In fact, I can't find an URL of a public-inbox
> > > repository of the notmuch mailing list.
> >
> > Kyle maintains an unofficial mirror at https://yhetil.org/notmuch
>
> Nice. Who is Kyle?

A notmuch user and public-inbox user/contributor; beyond that I
don't know.

public-inbox is all designed so anybody can make mirrors of any
mail they have.  (as I've mirrored a bunch of lists myself
without ever asking permission)

> > There's no real relationship between them aside from they both
> > use Xapian (and I learned Xapian from reading the notmuch source).
>
> I don't mean sharing the Xapian database (although that could be
> interesting for the future). I'm talking about as a client of
> public-inbox, not as a server.
>
> I mean doing a git clone for a public-inbox repository and notmuch
> indexing that repository.

Ah, the git repository formats are documented at:

        https://public-inbox.org/public-inbox-v2-format.html
        https://public-inbox.org/public-inbox-v1-format.html
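
To make the v2 layout a bit more concrete, here is a rough Python
sketch of reading messages out of one cloned epoch repository by
shelling out to git.  It assumes, per my reading of the v2 document,
that each commit adding a message stores it as a single blob named
"m" and that the history lives on "master"; commits without that
blob (deletions, for instance) are simply skipped.  The helper names
git_output and iter_epoch_messages are invented for this example.

```python
import subprocess


def git_output(git_dir, *args):
    """Run git against `git_dir` and return stdout as bytes."""
    return subprocess.run(
        ["git", "--git-dir", str(git_dir), *args],
        check=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    ).stdout


def iter_epoch_messages(git_dir, ref="master", git=git_output):
    """Yield (commit_id, raw_message_bytes) from one epoch, oldest first.

    Assumption: a commit that adds a message has it as a blob named "m".
    Commits where that blob is missing are skipped.
    """
    commits = git(git_dir, "rev-list", "--reverse", ref).decode().split()
    for commit in commits:
        try:
            yield commit, git(git_dir, "cat-file", "blob", f"{commit}:m")
        except subprocess.CalledProcessError:
            # No "m" blob in this commit's tree; nothing to yield.
            continue
```

Spawning one git process per message is slow on a large list; a real
importer would more likely keep a single "git cat-file --batch"
process open, or use libgit2 as l2md does.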

> > > Am I missing something or has nobody really worked on linking these two
> > > tools? Seems like an obvious area of opportunity.
> >
> > I think W. Trevor King (Cc-ed) also started looking something
> > many years ago, but I'm not sure if anything became of it.
> >
> > I never had the interest in using notmuch since Maildirs are a
> > non-starter with millions of messages with current FSes/OSes.
> > mairix + gzipped mboxes mostly works for me, (though mairix
> > indexing is silly expensive[1])
>
> If notmuch was patched to support the public-inbox format--as an
> alternative to Maildir--then users of public-inbox could clone a
> repository, and use notmuch to index that.
>
> I don't see how that could be difficult. But then again, I haven't
> looked at the Maildir code.

That would be cool; always room for more tools to interoperate
with each other.  (I'm quite busy with public-inbox and trying
to avoid AOT languages as much as possible).

Keep in mind some users are already happy with l2md and impibe
for writing Maildir, so there are already (space-inefficient) ways
to make notmuch index data from public-inboxes:

* l2md - Maildir and procmail importer using C + libgit2
  https://git.kernel.org/pub/scm/linux/kernel/git/dborkman/l2md.git

* impibe - Perl script to import v1 or v2 to Maildir
  https://leahneukirchen.org/dotfiles/bin/impibe
  discussion: https://public-inbox.org/meta/87v9m0l8t1.fsf@.../

(maybe more will appear at <https://public-inbox.org/clients.html>)
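
For illustration only, the Maildir-writing half of such an importer
can be sketched in a few lines of Python with the standard mailbox
module.  This is not how l2md or impibe are implemented; the function
name deliver_to_maildir is made up, and the raw messages are assumed
to come from some other step that extracts them from the git
repository.

```python
import mailbox


def deliver_to_maildir(maildir_path, raw_messages):
    """Append each raw message (bytes) to a Maildir and return the count.

    The Maildir (with its tmp/, new/ and cur/ subdirectories) is
    created if it does not exist yet.
    """
    box = mailbox.Maildir(str(maildir_path), create=True)
    count = 0
    try:
        for raw in raw_messages:
            box.add(raw)  # written under tmp/, then moved into new/
            count += 1
    finally:
        box.close()
    return count
```

If the Maildir sits under the directory notmuch is configured to
index, running "notmuch new" afterwards should pick the messages up.
The space inefficiency is plain to see here: every message ends up
stored twice, once in git and once as a loose file.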