[PATCH 0/2] lib: signature content type indexing

classic Classic list List threaded Threaded
7 messages Options
Jani Nikula Jani Nikula
Reply | Threaded
Open this post in threaded view
|

[PATCH 0/2] lib: signature content type indexing

"I find your lack of tests disturbing."

Anyway, better patches on the list than in my local tree, with or
without tests.

BR,
Jani.


Jani Nikula (2):
  lib: abstract content type indexing
  lib: index the content type of signature parts

 lib/index.cc | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

--
2.11.0

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
Jani Nikula Jani Nikula
Reply | Threaded
Open this post in threaded view
|

[PATCH 1/2] lib: abstract content type indexing

Make the follow-up change of indexing signature content types
easier. No functional changes.
---
 lib/index.cc | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/lib/index.cc b/lib/index.cc
index 2b98b588d38d..64bc92a5b56d 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -351,6 +351,19 @@ _index_address_list (notmuch_message_t *message,
     }
 }
 
+static void
+_index_content_type (notmuch_message_t *message, GMimeObject *part)
+{
+    GMimeContentType *content_type = g_mime_object_get_content_type (part);
+    if (content_type) {
+ char *mime_string = g_mime_content_type_to_string (content_type);
+ if (mime_string) {
+    _notmuch_message_gen_terms (message, "mimetype", mime_string);
+    g_free (mime_string);
+ }
+    }
+}
+
 /* Callback to generate terms for each mime part of a message. */
 static void
 _index_mime_part (notmuch_message_t *message,
@@ -361,6 +374,7 @@ _index_mime_part (notmuch_message_t *message,
     GMimeDataWrapper *wrapper;
     GByteArray *byte_array;
     GMimeContentDisposition *disposition;
+    GMimeContentType *content_type;
     char *body;
     const char *charset;
 
@@ -370,15 +384,7 @@ _index_mime_part (notmuch_message_t *message,
  return;
     }
 
-    GMimeContentType *content_type = g_mime_object_get_content_type(part);
-    if (content_type) {
- char *mime_string = g_mime_content_type_to_string(content_type);
- if (mime_string)
- {
-    _notmuch_message_gen_terms (message, "mimetype", mime_string);
-    g_free(mime_string);
- }
-    }
+    _index_content_type (message, part);
 
     if (GMIME_IS_MULTIPART (part)) {
  GMimeMultipart *multipart = GMIME_MULTIPART (part);
@@ -447,6 +453,8 @@ _index_mime_part (notmuch_message_t *message,
     g_mime_stream_mem_set_owner (GMIME_STREAM_MEM (stream), FALSE);
 
     filter = g_mime_stream_filter_new (stream);
+
+    content_type = g_mime_object_get_content_type (part);
     discard_non_term_filter = notmuch_filter_discard_non_term_new (content_type);
 
     g_mime_stream_filter_add (GMIME_STREAM_FILTER (filter),
--
2.11.0

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
Jani Nikula Jani Nikula
Reply | Threaded
Open this post in threaded view
|

[PATCH 2/2] lib: index the content type of signature parts

In reply to this post by Jani Nikula
It's useful (*) to be able to easily find messages with certain types
of signatures. Having the mimetype: prefix searches fail for some
content types is also genuinely surprising (*). Index the content type
of signature parts.

While at it, switch to the gmime convenience constants for content and
signature part indexes.

*) At least for developers of email software!
---
 lib/index.cc | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/lib/index.cc b/lib/index.cc
index 64bc92a5b56d..0beaae62f048 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -398,12 +398,15 @@ _index_mime_part (notmuch_message_t *message,
 
  for (i = 0; i < g_mime_multipart_get_count (multipart); i++) {
     if (GMIME_IS_MULTIPART_SIGNED (multipart)) {
- /* Don't index the signature. */
- if (i == 1)
+ /* Don't index the signature, but index its content type. */
+ if (i == GMIME_MULTIPART_SIGNED_SIGNATURE) {
+    _index_content_type (message,
+ g_mime_multipart_get_part (multipart, i));
     continue;
- if (i > 1)
+ } else if (i != GMIME_MULTIPART_SIGNED_CONTENT) {
     _notmuch_database_log (_notmuch_message_database (message),
-  "Warning: Unexpected extra parts of multipart/signed. Indexing anyway.\n");
+   "Warning: Unexpected extra parts of multipart/signed. Indexing anyway.\n");
+ }
     }
     if (GMIME_IS_MULTIPART_ENCRYPTED (multipart)) {
  /* Don't index encrypted parts. */
--
2.11.0

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
Daniel Kahn Gillmor Daniel Kahn Gillmor
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/2] lib: index the content type of signature parts

On Wed 2017-09-13 22:13:35 +0300, Jani Nikula wrote:
> It's useful (*) to be able to easily find messages with certain types
> of signatures. Having the mimetype: prefix searches fail for some
> content types is also genuinely surprising (*). Index the content type
> of signature parts.
>
> While at it, switch to the gmime convenience constants for content and
> signature part indexes.
>
> *) At least for developers of email software!

this series of 2 patches lgtm.

     --dkg

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch

signature.asc (847 bytes) Download Attachment
Daniel Kahn Gillmor Daniel Kahn Gillmor
Reply | Threaded
Open this post in threaded view
|

[PATCH] lib: index the content-type of the parts of encrypted messages

This is a logical followup to "lib: index the content type of
signature parts", which will make it easier to record the message
structure of all messages.
---
 lib/index.cc | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/index.cc b/lib/index.cc
index 0beaae62..ceb444df 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -409,7 +409,14 @@ _index_mime_part (notmuch_message_t *message,
  }
     }
     if (GMIME_IS_MULTIPART_ENCRYPTED (multipart)) {
- /* Don't index encrypted parts. */
+ /* Don't index encrypted parts, but index their content type. */
+ _index_content_type (message,
+     g_mime_multipart_get_part (multipart, i));
+ if ((i != GMIME_MULTIPART_ENCRYPTED_VERSION) &&
+    (i != GMIME_MULTIPART_ENCRYPTED_CONTENT)) {
+    _notmuch_database_log (_notmuch_message_database (message),
+   "Warning: Unexpected extra parts of multipart/encrypted.\n");
+ }
  continue;
     }
     _index_mime_part (message,
--
2.14.1

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
Jani Nikula Jani Nikula
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] lib: index the content-type of the parts of encrypted messages

On Fri, Sep 15, 2017 at 8:38 AM, Daniel Kahn Gillmor
<[hidden email]> wrote:

> This is a logical followup to "lib: index the content type of
> signature parts", which will make it easier to record the message
> structure of all messages.
> ---
>  lib/index.cc | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/index.cc b/lib/index.cc
> index 0beaae62..ceb444df 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -409,7 +409,14 @@ _index_mime_part (notmuch_message_t *message,
>                 }
>             }
>             if (GMIME_IS_MULTIPART_ENCRYPTED (multipart)) {
> -               /* Don't index encrypted parts. */
> +               /* Don't index encrypted parts, but index their content type. */
> +               _index_content_type (message,
> +                                    g_mime_multipart_get_part (multipart, i));
> +               if ((i != GMIME_MULTIPART_ENCRYPTED_VERSION) &&
> +                   (i != GMIME_MULTIPART_ENCRYPTED_CONTENT)) {

Nitpick, the extra braces aren't needed here. But the patch does what
it says on the box.

I was first wondering about the usefulness of indexing
"application/octet-stream" for the encrypted content parts, but then I
think it's good for completeness.

BR,
Jani.

> +                   _notmuch_database_log (_notmuch_message_database (message),
> +                                          "Warning: Unexpected extra parts of multipart/encrypted.\n");
> +               }
>                 continue;
>             }
>             _index_mime_part (message,
> --
> 2.14.1
>
> _______________________________________________
> notmuch mailing list
> [hidden email]
> https://notmuchmail.org/mailman/listinfo/notmuch
_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch
Daniel Kahn Gillmor Daniel Kahn Gillmor
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] lib: index the content-type of the parts of encrypted messages

On Fri 2017-09-15 10:48:08 +0300, Jani Nikula wrote:
> Nitpick, the extra braces aren't needed here. But the patch does what
> it says on the box.

Thanks for the review!

I prefer to keep the braces, i think they make it clearer what's
happening.  If whoever's merging prefers to remove the braces, i'd be
willing to accept that too.

This section is also updated in my cleartext-index series, which depends
on this cleanup series, so removing the braces will require yet another
rebase of that series, which is what i'd really like to get to.

> I was first wondering about the usefulness of indexing
> "application/octet-stream" for the encrypted content parts, but then I
> think it's good for completeness.

It would also make it possible to scan a maildir for encrypted messages
that *don't* have application/octet-stream for example, to see what is
going on there.

      --dkg

_______________________________________________
notmuch mailing list
[hidden email]
https://notmuchmail.org/mailman/listinfo/notmuch

signature.asc (847 bytes) Download Attachment