sup

A curses threads-with-tags style email client

sup.git

git clone https://supmua.dev/git/sup/
commit d836c1d8d069bbaf48b818dddbb72accaf3542d2
parent 04b6fa5bb629c63fad2a3741bbf5ade06014ba5b
Author: Dan Callaghan <djc@djc.id.au>
Date:   Sun, 19 Jul 2020 17:56:55 +1000

don't attempt to fix bad encoding in RFC2047 words

If we find an RFC2047-encoded header word with an unrecognised charset
name or invalid bytes, let's just display it as is without decoding.

This is one of the options recommended in the RFC for MUAs to handle
words which cannot be decoded successfully, and it's more likely to
produce something readable than forcibly reinterpreting the string as
UTF-8, which is what our String#transcode method was doing.
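
For illustration only, here is a minimal standalone sketch of the fallback
described above. The helper name decode_or_keep and its parameters are
hypothetical and are not part of lib/sup/rfc2047.rb; it assumes the encoded
word's payload has already been unpacked from base64 or quoted-printable
into raw bytes, and it rescues the same two exception classes as the patch.

```ruby
# Hypothetical helper, not the code in lib/sup/rfc2047.rb.
# word:    the original encoded word, e.g. "=?US-ASCII?b?/w?="
# text:    the raw bytes obtained after undoing base64 / quoted-printable
# charset: the charset name taken from the encoded word
def decode_or_keep(word, text, charset, target = Encoding::UTF_8)
  # dup so the caller's string is not relabelled in place.
  # force_encoding raises ArgumentError for an unknown charset name;
  # encode raises InvalidByteSequenceError for bytes that are not
  # valid in the claimed charset.
  text.dup.force_encoding(charset).encode(target)
rescue ArgumentError, Encoding::InvalidByteSequenceError
  # Cannot decode: show the encoded word exactly as it was received.
  word
end

# A well-formed word decodes normally.
good = decode_or_keep("=?ISO-8859-1?q?,_Ingrid_B=F8?=",
                      ", Ingrid B\xF8".b, "ISO-8859-1")

# Byte 0xFF is not valid US-ASCII, so the word is kept undecoded.
bad_bytes = decode_or_keep("=?US-ASCII?b?/w?=", "\xFF".b, "US-ASCII")

# A charset name Ruby does not recognise also leaves the word untouched.
bad_name = decode_or_keep("=?no-such-charset?q?abc?=", "abc".b,
                          "no-such-charset")
```

With this shape a word that cannot be decoded stays visible in its raw
=?charset?enc?payload?= form, so the reader can at least see that something
was there instead of getting reinterpreted bytes.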

Diffstat:
M lib/sup/rfc2047.rb | 6 +++++-
M test/fixtures/rfc2047-header-encoding.eml | 2 ++
M test/test_message.rb | 3 ++-
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/lib/sup/rfc2047.rb b/lib/sup/rfc2047.rb
@@ -50,7 +50,11 @@ module Rfc2047
         # WORD.
       end
 
-      text.transcode(target, charset)
+      begin
+        text.force_encoding(charset).encode(target)
+      rescue ArgumentError, Encoding::InvalidByteSequenceError
+        word
+      end
     end
   end
 end
diff --git a/test/fixtures/rfc2047-header-encoding.eml b/test/fixtures/rfc2047-header-encoding.eml
@@ -5,7 +5,9 @@ Subject:
  =?US-ASCII?q?Hans Martin Djupvik?= =?ISO-8859-1?q?,_Ingrid_B=F8?=
  =?KOI8-R?b?LCDp0snOwSDzycTP0s/XwQ?=
  =?UTF-16?b?//4sACAASgBlAHMAcABlAHIAIABCAGUAcgBnAA?=
+ bad: =?UTF16?q?badcharsetname?= =?US-ASCII?b?/w?=
 
 The subject header contains various RFC2047 encoded words.
 For completeness we test both base64 and quoted-printable, and some
 ASCII-incompatible encodings.
+We also include some bogus words which cannot be decoded.
diff --git a/test/test_message.rb b/test/test_message.rb
@@ -240,7 +240,8 @@ class TestMessage < Minitest::Test
     sup_message = Message.build_from_source(source, source_info)
     sup_message.load_from_source!
 
-    assert_equal("Hans Martin Djupvik, Ingrid Bø, Ирина Сидорова, Jesper Berg",
+    assert_equal("Hans Martin Djupvik, Ingrid Bø, Ирина Сидорова, Jesper Berg " +
+                 "bad: =?UTF16?q?badcharsetname?==?US-ASCII?b?/w?=",
                  sup_message.subj)
   end