#10974 closed defect (fixed)
sending unicode email fails
Reported by: | anonymous | Owned by: | Ryan J Ollos |
---|---|---|---|
Priority: | normal | Component: | AnnouncerPlugin |
Severity: | normal | Keywords: | |
Cc: | Stephan Geulette, Dmitri | Trac Release: | 1.0 |
Description
The AnnouncerPlugin fails to send unicode emails for me. I get the following traceback:
Traceback (most recent call last):
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/api.py", line 584, in _real_send
    evt)
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 488, in _do_send
    msgText = MIMEText(output, msg_format)
  File "/usr/lib/python2.7/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib/python2.7/email/message.py", line 226, in set_payload
    self.set_charset(charset)
  File "/usr/lib/python2.7/email/message.py", line 262, in set_charset
    self._payload = self._payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 32: ordinal not in range(128)
I'm using mime_encoding = base64 and the plain-text email format (though it seems to happen for HTML email as well). I'm on Python 2.7.3 and Trac 1.0.1.
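The final frame of the traceback is an implicit ASCII encode of a unicode payload. As a minimal sketch, independent of AnnouncerPlugin and of the email package, the same exception class can be triggered by encoding the reported character with the ascii codec. The helper name try_encode and the sample string are hypothetical and not taken from the plugin:

```python
def try_encode(text, codec):
    """Return the encoded bytes, or the UnicodeEncodeError that was raised."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc


# Sample text containing u'\xfc', the character named in the traceback.
sample = u'Geb\xfchren'

ascii_result = try_encode(sample, 'ascii')  # fails: u'\xfc' is outside 0-127
utf8_result = try_encode(sample, 'utf-8')   # succeeds

print(type(ascii_result).__name__)
print(repr(utf8_result))
```

This only shows why the encode step fails; it says nothing about where in the plugin the charset should be supplied.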
Attachments (2)
Change History (25)
comment:1 Changed 12 years ago by
comment:2 follow-up: 3 Changed 12 years ago by
Hi, would you mind trying out this (untested) patch?
announcerplugin/trunk/announcer/distributors/mail.py
diff --git a/announcerplugin/trunk/announcer/distributors/mail.py b/announcerplu
index b994502..bd79d1f 100644
--- a/announcerplugin/trunk/announcer/distributors/mail.py
+++ b/announcerplugin/trunk/announcer/distributors/mail.py
@@ -478,16 +478,14 @@ class EmailDistributor(Component):
             rootMessage.attach(parentMessage)
 
             alt_msg_format = 'html' in alternate_style and 'html' or 'plain'
-            msgText = MIMEText(alternate_output, alt_msg_format)
-            msgText.set_charset(self._charset)
+            msgText = MIMEText(alternate_output, alt_msg_format, self._charset)
             parentMessage.attach(msgText)
         else:
             parentMessage = rootMessage
 
         msg_format = 'html' in format and 'html' or 'plain'
-        msgText = MIMEText(output, msg_format)
+        msgText = MIMEText(output, msg_format, self._charset)
         del msgText['Content-Transfer-Encoding']
-        msgText.set_charset(self._charset)
         # According to RFC 2046, the last part of a multipart message is best
         # and preferred.
         parentMessage.attach(msgText)
Based on the documentation for MIMEText, we may need to specify the charset in the constructor.
comment:3 follow-up: 4 Changed 12 years ago by
Thanks for your quick response. I'm fine with testing your suggestion. Here's the result:
Traceback (most recent call last):
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/api.py", line 584, in _real_send
    evt)
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 490, in _do_send
    msgText = MIMEText(output, msg_format, self._charset)
  File "/usr/lib/python2.7/email/mime/text.py", line 29, in __init__
    **{'charset': _charset})
  File "/usr/lib/python2.7/email/mime/base.py", line 25, in __init__
    self.add_header('Content-Type', ctype, **_params)
  File "/usr/lib/python2.7/email/message.py", line 408, in add_header
    parts.append(_formatparam(k.replace('_', '-'), v))
  File "/usr/lib/python2.7/email/message.py", line 45, in _formatparam
    if value is not None and len(value) > 0:
(Regarding the line numbers, I should add that I didn't remove the other lines but commented them out.)
I did some further debugging, which is hopefully useful to you: self._charset is an instance of email.charset.Charset with the following attributes (extracted by printing self._charset.__dict__ to the log file): {'input_codec': 'utf-8', 'body_encoding': 2, 'input_charset': 'utf-8', 'header_encoding': 2, 'output_charset': 'utf-8', 'output_codec': 'utf-8'}. As the Python documentation of the MIMEText constructor says that the third parameter defaults to us-ascii, I just tried the string utf-8, which however creates a broken email like this:
Return-path: <trac@wobsta.de>
Envelope-to: contact@wobsta.de
Delivery-date: Wed, 27 Mar 2013 14:27:38 +0100
Received: from localhost ([127.0.0.1] helo=h2032560.stratoserver.net)
    by h2032560.stratoserver.net with esmtp (Exim 4.76)
    (envelope-from <trac@wobsta.de>)
    id 1UKqO9-0008MK-Tl
    for contact@wobsta.de; Wed, 27 Mar 2013 14:27:37 +0100
Content-Type: multipart/related; boundary="===============2509084458330105465=="
MIME-Version: 1.0
Date: Wed, 27 Mar 2013 13:27:37 -0000
To: "undisclosed-recipients:"
Reply-To: trac@wobsta.de
Message-ID: <041.feac52a1e43c7efa8e99991f7bd1ff00@wobsta.de>
From: "wobsta.de" <trac@wobsta.de>
Subject: Blog: rundfunkgebuehren2 comment deleted
Auto-Submitted: auto-generated
Precedence: bulk
X-Announcer-Version: 1.0dev-r12503
X-Mailer: AnnouncerPlugin v1.0dev-r12503 on Trac v1.0.1
X-Trac-Announcement-Realm: blog
X-Trac-Project: wobsta.de
X-Trac-Version: 1.0.1
X-SA-Exim-Connect-IP: 127.0.0.1
X-SA-Exim-Mail-From: trac@wobsta.de
X-SA-Exim-Scanned: No (on h2032560.stratoserver.net); SAEximRunCond expanded to false

This is a multi-part message in MIME format.
--===============2509084458330105465==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64

STNKMWJtUm1kVzVyWjJWaWRXVm9jbVZ1TWpvZ1VuVnVaR1oxYm10blpXTER2R2h5Wlc0Z1pzTzhj
aUJEYjIxd2RYUmxjaUJwYmlCbAphVzVsYlNCSVpXbHRZc084Y204c0lGWmxjbWhoYm1Sc2RXNW5J
SFZ1WkNCRmJuUnpZMmhsYVdSMWJtY2dZbVZwYlNCQ2RXNWtaWE4yClpYSjNZV3gwZFc1bmMyZGxj
bWxqYUhRS0Nnb0sK

--===============2509084458330105465==--
which displays as some random garbage (looks like base64 or so) in the mailer.
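The body of the message above looks like base64 applied twice. A minimal sketch to check this, using only the payload text quoted in that message (the variable names are illustrative):

```python
import base64

# Base64 body copied from the broken message quoted above.
payload = """
STNKMWJtUm1kVzVyWjJWaWRXVm9jbVZ1TWpvZ1VuVnVaR1oxYm10blpXTER2R2h5Wlc0Z1pzTzhj
aUJEYjIxd2RYUmxjaUJwYmlCbAphVzVsYlNCSVpXbHRZc084Y204c0lGWmxjbWhoYm1Sc2RXNW5J
SFZ1WkNCRmJuUnpZMmhsYVdSMWJtY2dZbVZwYlNCQ2RXNWtaWE4yClpYSjNZV3gwZFc1bmMyZGxj
bWxqYUhRS0Nnb0sK
"""

# b64decode ignores the line breaks in the input by default.
once = base64.b64decode(payload)   # yields another block of base64 text
twice = base64.b64decode(once)     # yields the UTF-8 encoded message body
text = twice.decode('utf-8')

print(text)
```

If the second decode yields readable text, the body was base64-encoded twice somewhere in the sending path. Where the second encoding happens is not established in this ticket.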
However, if I encode output using utf-8, i.e. use msgText = MIMEText(output.encode('utf-8'), msg_format), everything looks fine. Maybe the correct code should thus be msgText = MIMEText(output.encode(self._charset.input_codec), msg_format), but I'm not sure.
For the curious, as you asked me, this is the announcer section of my trac.ini:
[announcer]
default_email_format = text/plain
email_address_resolvers = SpecifiedEmailResolver, SessionEmailResolver, DefaultDomainEmailResolver
email_enabled = true
email_from = trac@wobsta.de
email_from_name =
email_replyto = trac@wobsta.de
email_sender = SmtpEmailSender
email_subject_prefix = __default__
email_to = undisclosed-recipients:
mime_encoding = base64
use_public_cc = false
use_threaded_delivery = false
comment:4 Changed 12 years ago by
Replying to anonymous:
... However, if I encode output using utf-8, i.e. use msgText = MIMEText(output.encode('utf-8'), msg_format), everything looks fine. Maybe the correct code should thus be msgText = MIMEText(output.encode(self._charset.input_codec), msg_format), but I'm not sure.
That seems like a good workaround for now. It seems like it should be possible to specify the encoding when constructing the MIMEText object, but the documentation was unclear, and I made a bit of an assumption when deciding to pass a Charset object as the third parameter.
Thank you for debugging and providing all of this info. I'll do some more testing in the next day or so, but if all else fails, we can just encode the output that is passed to the MIMEText constructor like you did.
comment:5 Changed 12 years ago by
I'm fine with my patched version for now, but I really want to emphasize that it needs fixing upstream. The reason is that it's a really odd failure if you don't monitor your system in detail. Let me explain: if you just set up Trac with the Announcer plugin and test it with some simple message (ASCII only), it pretty much works like a charm. (So first of all, thanks for the great plugin, it's awesome.)
I did so as well, but I was lucky to test it with some real data, which fortunately happened to contain non-ASCII characters (as will be rather common in some of my use cases). Anyway, the problem is that everything still looks OK, but the email is not sent at all. If you turn on logging in Trac, you will find the traceback in the logs, but for the person who triggered the email everything looks fine. You do not receive an error, and the item is updated online properly. So at the end of the day, people will just not receive their notification emails, without anybody understanding why this happens! I guess you agree that this is a really serious issue.
I was just poking around to see whether my solution is correct. I agree that the Python documentation is not really useful here. (It's a shame, actually.) However, I found a blog post about it: http://mg.pov.lt/blog/unicode-emails-in-python ... which supports my solution. On the other hand, people have been discussing the problem elsewhere as well: see ht tp://bugs.python.org/issue1368247 ... but it looks like the fix has still not been applied to Python 2.7. On my system Python still does not contain the suggested patch (ht tp://bugs.python.org/file12190/mimetext-unicode.patch). I'm on Ubuntu 12.04 LTS using the Python from the distribution, which happens to be Python 2.7.3. I was just lucky to try such an encoding myself, however, using self._charset.input_codec, which just happens to be utf-8 in my case as well. Probably self._charset.output_charset is the right thing (although I wonder why it should be the output charset, as the output in the announcer code is kind of the input to the email system, but I just might be confused about the terminology).
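On the input_codec versus output_charset question: email.charset.Charset keeps the two separately because, for some charsets, the email package converts to a different charset on output. A small sketch, assuming the standard library's built-in charset table; euc-jp is used purely as an illustration and does not come from this ticket:

```python
from email.charset import BASE64, Charset

utf8 = Charset('utf-8')
# For utf-8 the input and output sides are the same, matching the
# attribute dump in comment:3.
print(utf8.input_codec, utf8.output_charset, utf8.body_encoding == BASE64)

eucjp = Charset('euc-jp')
# For euc-jp the standard library registers iso-2022-jp as the charset
# to convert to on output, so the two attributes differ.
print(eucjp.input_charset, eucjp.output_charset)
```

For a utf-8 configuration either attribute gives the same codec name, which would explain why both variants behave identically in the reports here.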
comment:6 Changed 12 years ago by
PS: I'm sorry for posting broken links. The spam filter complained about my reply being spam due to too many external links, so I slightly broke the link format. I'm sorry for that workaround (instead of creating an account for myself, which probably would have worked around the spam filter as well).
comment:7 Changed 11 years ago by
Cc: | Stephan Geulette added; anonymous removed |
---|
#11227 closed as a duplicate. We should consider applying the patch in this ticket.
comment:8 Changed 11 years ago by
I'm using AnnouncerPlugin from trunk (r13373) and still get this error:
2013-09-02 23:45:45,425 Trac[api] ERROR: AnnouncementSystem failed.
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/announcer/api.py", line 584, in _real_send
    evt)
  File "/usr/lib/python2.5/site-packages/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/usr/lib/python2.5/site-packages/announcer/distributors/mail.py", line 490, in _do_send
    msgText.set_charset(self._charset)
  File "/usr/lib/python2.5/email/message.py", line 262, in set_charset
    self._payload = charset.body_encode(self._payload)
  File "/usr/lib/python2.5/email/charset.py", line 384, in body_encode
    return email.base64mime.body_encode(s)
  File "/usr/lib/python2.5/email/base64mime.py", line 148, in encode
    enc = b2a_base64(s[i:i + max_unencoded])
UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-13: ordinal not in range(128)
This is on Trac 1.0.1. The workaround with MIMEText(output.encode('utf-8'), msg_format) helps.
comment:9 Changed 11 years ago by
Looks like Trac's notification system fixed this problem in trac:changeset:10176 by letting Genshi encode the body: trac:source:trunk/trac/notification.py@12085:416,474#L409
Announcer could do this in each formatter: source:announcerplugin/trunk/announcer/formatters.py@12359:152,248,318#L248
comment:11 Changed 11 years ago by
Replying to rjollos:
Issue was raised again on the mailing list.
And suddenly it appears again. No changes to the system. But e-mail stops. I get this in the Trac log:
2013-12-03 11:52:47,761 Trac[api] ERROR: AnnouncementSystem failed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r0-py2.7.egg/announcer/api.py", line 584, in _real_send
    evt)
  File "/usr/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r0-py2.7.egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/usr/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r0-py2.7.egg/announcer/distributors/mail.py", line 481, in _do_send
    msgText = MIMEText(alternate_output, alt_msg_format)
  File "/usr/lib/python2.7/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib/python2.7/email/message.py", line 226, in set_payload
    self.set_charset(charset)
  File "/usr/lib/python2.7/email/message.py", line 262, in set_charset
    self._payload = self._payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 472: ordinal not in range(128)
I have made the edit suggested in comment:8. Odd that it seemed to solve the problem for a bit, and then the problem returned.
comment:13 Changed 11 years ago by
I ran into exactly the same problem.
It occurred out of the blue after no issues whatsoever for approx. 30 tickets with various comments and modifications along the way. Then all of a sudden this issue showed up. My error message concerned a whitespace character:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u200b' in position 4639: ordinal not in range(128)
The fact that it occurred after some time makes me wonder if this is somehow tied to a counter that is being incremented with each ticket creation, modification, comment etc...
At any rate I applied the patch implied in the comment above and it corrected the problem for me. At least for the time being. See: attachment:formatters_patch.diff. And I apologize that the description of the attachment is linking to an incorrect ticket.
comment:14 Changed 11 years ago by
Cc: | Dmitri added |
---|
Changed 10 years ago by
Attachment: | sending-unicode-email-fails.patch added |
---|
changed patch from comment:3 which worked for me
comment:15 follow-up: 16 Changed 10 years ago by
I tried the patch from comment:3 (with output.encode(self._charset.input_codec)): the exception disappeared, but now all parts have charset="us-ascii" in the Content-Type header. It seems that the lines with msgText.set_charset(self._charset) should be retained.
attachment:formatters_patch.diff didn't help at all.
attachment:sending-unicode-email-fails.patch works for me.
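To check which charset a generated part actually declares, the part can be inspected directly rather than read from a delivered message. The sketch below is written for Python 3, where MIMEText accepts a text string together with a charset name. It is not the plugin's code, and the Python 2.7 behaviour discussed in this ticket differs:

```python
from email.mime.text import MIMEText

# Hypothetical sample body containing non-ASCII characters.
body = u'Rundfunkgeb\xfchren f\xfcr Computer'

ascii_part = MIMEText(u'plain ASCII text', 'plain')
utf8_part = MIMEText(body, 'plain', 'utf-8')

# get_content_charset() reads the charset parameter of Content-Type.
print(ascii_part.get_content_charset())
print(utf8_part.get_content_charset())
print(utf8_part['Content-Transfer-Encoding'])

# Decoding the transfer encoding and then the charset should give back
# the original text.
decoded = utf8_part.get_payload(decode=True).decode('utf-8')
print(decoded == body)
```

The same get_content_charset() call could be used on each part before sending to confirm that a patch has not silently fallen back to us-ascii.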
comment:16 follow-up: 17 Changed 10 years ago by
Replying to g1itch:
I tried the patch from comment:3 (with output.encode(self._charset.input_codec)): the exception disappeared, but now all parts have charset="us-ascii" in the Content-Type header. It seems that the lines with msgText.set_charset(self._charset) should be retained.
attachment:formatters_patch.diff didn't help at all.
attachment:sending-unicode-email-fails.patch works for me.
I had a similar problem, and attachment:sending-unicode-email-fails.patch also works for me.
Could you push it to trunk?
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/announcer/api.py", line 584, in _real_send
    evt)
  File "build/bdist.linux-x86_64/egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "build/bdist.linux-x86_64/egg/announcer/distributors/mail.py", line 488, in _do_send
    msgText = MIMEText(output, msg_format)
  File "/usr/lib/python2.7/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib/python2.7/email/message.py", line 226, in set_payload
    self.set_charset(charset)
  File "/usr/lib/python2.7/email/message.py", line 262, in set_charset
    self._payload = self._payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u015b' in position 168: ordinal not in range(128)
comment:17 Changed 10 years ago by
Replying to dskrzypczak:
I had a similar problem, and attachment:sending-unicode-email-fails.patch also works for me.
Could you push it to trunk?
Thanks for the positive test feedback. Will do.
comment:19 Changed 9 years ago by
Owner: | changed from Steffen Hoffmann to Ryan J Ollos |
---|---|
Status: | new → accepted |
comment:22 Changed 9 years ago by
The changeset contains an error in line 481:
if isinstance(alternative_output, unicode):
should be:
if isinstance(alternate_output, unicode):
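The check in question presumably guards an encode call so that only text strings are converted. A hedged sketch of that pattern, written so that it runs on both Python 2 and Python 3; encode_if_text is a hypothetical helper and not the code in the changeset:

```python
def encode_if_text(value, encoding='utf-8'):
    """Encode text strings to bytes; return byte strings unchanged."""
    text_type = type(u'')  # unicode on Python 2, str on Python 3
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


print(repr(encode_if_text(u'f\xfcr')))
print(repr(encode_if_text(b'already bytes')))
```

A misspelled variable name in such a guard only fails when that branch runs, which would fit the error going unnoticed until an alternate format was rendered.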
I'm a bit surprised by this. Could you paste the [announcer] section from your trac.ini file just so we can double-check.