fix handling of RFC2047 words containing invalid bytes after decoding

commit b570d0cd16c9176cebf70739fd02ddf7f84a8c06
parent 4e23e13e705c02f9a341a04338b5b46994c01d24
Author: Dan Callaghan <djc@djc.id.au>
Date:   Sun, 12 Apr 2026 16:36:01 +1000

fix handling of RFC2047 words containing invalid bytes after decoding

Yet another bad case somehow not covered by all the tests. If the
encoding is recognised, and the word is successfully decoded, but the
resulting string contains an invalid byte sequence for its declared
encoding, Sup would crash with ArgumentError.
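
To illustrate the failure condition, here is a minimal Ruby sketch using the encoded word from the test fixture. It assumes quoted-printable decoding via `unpack1("M")` is a close enough stand-in for Q-decoding of this payload; that is a simplification and not necessarily how Sup decodes it.

```ruby
# Payload of the encoded word =?utf-8?q?YouTube-tj=E4nst?= from the test fixture.
payload = "YouTube-tj=E4nst"

# Assumption: quoted-printable decoding approximates Q-decoding for this payload.
decoded = payload.unpack1("M")   # binary string containing the lone byte 0xE4

# Labelling the bytes as UTF-8 and "converting" to UTF-8 raises nothing,
# because source and target encodings are the same and no validation happens.
as_utf8 = decoded.dup.force_encoding("utf-8")
as_utf8.encode!("utf-8")

# The same bytes are valid ISO-8859-1, which is probably what the sender meant.
as_latin1 = decoded.dup.force_encoding("iso-8859-1").encode("utf-8")

puts as_utf8.valid_encoding?     # false
puts as_latin1                   # YouTube-tjänst
```

The word decodes without an exception, yet the result is not valid UTF-8, so the error only surfaces later when the string is used.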

Only use the decoded word if the end result has a valid encoding.
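
The shape of the fix can be sketched in isolation. The helper `decode_or_keep` and its parameters are hypothetical names introduced for illustration; they are not part of Sup, and the real change lives inside the block in `lib/sup/rfc2047.rb`.

```ruby
# Hypothetical helper mirroring the fix: return the decoded text only when
# it is valid in the target encoding, otherwise keep the original word.
def decode_or_keep(word, raw_bytes, charset, target = "utf-8")
  text = raw_bytes.dup
  begin
    text.force_encoding charset   # raises ArgumentError for unknown charset names
    text.encode! target           # raises EncodingError on failed conversion
  rescue ArgumentError, EncodingError
    return word
  end
  # No exception does not imply valid bytes, so check explicitly.
  return word unless text.valid_encoding?
  text
end

word = "=?utf-8?q?YouTube-tj=E4nst?="
kept = decode_or_keep(word, "YouTube-tj\xE4nst".b, "utf-8")
good = decode_or_keep("ignored", "tj\xC3\xA4nst".b, "utf-8")
unknown = decode_or_keep("ignored", "abc".b, "no-such-charset")

puts kept      # =?utf-8?q?YouTube-tj=E4nst?=
puts good      # tjänst
puts unknown   # ignored
```

Falling back to the raw encoded word keeps the header readable and lossless, at the cost of showing the undecoded form to the user.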

Diffstat:

M	lib/sup/rfc2047.rb	|	7	+++++--
M	test/fixtures/rfc2047-header-encoding.eml	|	2	+-
M	test/test_message.rb	|	1	+

3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/sup/rfc2047.rb b/lib/sup/rfc2047.rb
@@ -62,10 +62,13 @@ module Rfc2047
       end
 
       begin
-        text.force_encoding(charset).encode(target)
+        text.force_encoding charset
+        text.encode! target
       rescue ArgumentError, EncodingError
-        word
+        next word
       end
+      next word unless text.valid_encoding?
+      text
     end
   end
 end
diff --git a/test/fixtures/rfc2047-header-encoding.eml b/test/fixtures/rfc2047-header-encoding.eml
@@ -1,4 +1,4 @@
-From: test@example.invalid
+From: =?utf-8?q?YouTube-tj=E4nst?= <service@youtube.com>
 To: test@example.invalid
 Date: Sun, 19 Jul 2020 17:03:56 +1000
 Subject:
diff --git a/test/test_message.rb b/test/test_message.rb
@@ -253,6 +253,7 @@ class TestMessage < Minitest::Test
                  "bad: =?UTF16?q?badcharsetname?==?US-ASCII?b?/w?=" +
                  "=?UTF-7?Q?=41=6D=65=72=69=63=61=E2=80=99=73?=",
                  sup_message.subj)
+    assert_equal "=?utf-8?q?YouTube-tj=E4nst?=", sup_message.from.name
   end
 
   def test_nonascii_header

sup.git