smigs Newbie
Joined: 24 February 2014 Location: United Kingdom
Posted: 24 February 2014 at 3:48am
Is there a way to tell MailBee to use the charset from an HTML META tag (if there is one) when interpreting a message part, or to have it guess the charset, ignoring what the MIME headers say? I know that when using MailMessage.LoadBodyText() you can set ImportBodyOptions.PreferCharsetFromMetaTag, but I'm downloading these messages via IMAP, so that method isn't used.
I am facing a problem where the sender is basically providing invalid e-mails, with the following headers:
Content-Type: text/html;
    charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
The charset here is specified as iso-8859-1, however the HTML is actually in UTF-16, and there's a META tag to that effect inside the HTML itself. There's also a byte order mark (encoded to =FF=FE in quoted-printable) at the start of the message.
At the moment, when I examine the MailMessage.BodyHtmlText string, it has a null character (\0) after every printable character, plus two extra characters at the start of the message. This is presumably because MailBee has decoded the quoted-printable data back to the raw UTF-16 bytes, but then interpreted those bytes as ISO-8859-1 when converting to the default output encoding (UTF-8?). UTF-16 uses two bytes per character here whereas ISO-8859-1 uses just one, so each character is followed by a NULL in the next 'character' (none of the characters in these messages actually use the second byte of the UTF-16 pair). The two extra characters at the start are presumably the byte order mark decoded as ISO-8859-1 characters.
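The effect can be reproduced without MailBee at all. The following is a minimal sketch using only System.Text; the class and method names (CharsetMismatchDemo, Misdecode) are made up for illustration, and it only mimics what I assume the parser is doing rather than showing MailBee's actual code path.

```csharp
using System;
using System.Linq;
using System.Text;

static class CharsetMismatchDemo
{
    // Encodes text as UTF-16LE with a BOM, then decodes those bytes as
    // ISO-8859-1, mimicking a parser that trusts a wrong charset header.
    public static string Misdecode(string text)
    {
        byte[] bom = Encoding.Unicode.GetPreamble();    // FF FE
        byte[] body = Encoding.Unicode.GetBytes(text);  // UTF-16LE, no BOM
        byte[] raw = bom.Concat(body).ToArray();
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    static void Main()
    {
        string garbled = Misdecode("Hi");
        // Print each character as a hex code so the NULs are visible.
        Console.WriteLine(string.Join(" ", garbled.Select(c => ((int)c).ToString("X2"))));
        // Prints: FF FE 48 00 69 00
    }
}
```

The first two characters (FF FE) are the byte order mark read as ISO-8859-1, and every real character is followed by 00, which matches what BodyHtmlText shows.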
Any thoughts? It seems as though something in MessageParserConfig should be able to handle this, but I'm drawing a blank.
Igor AfterLogic Support
Joined: 24 June 2008 Location: United States
Posted: 24 February 2014 at 4:18am
Would it be possible to provide us with a sample mail message for examination? Please make sure it's saved as an EML file, and submit it privately via HelpDesk.
--
Regards,
Igor, AfterLogic Support
smigs Newbie
Posted: 24 February 2014 at 5:48am
Done
smigs Newbie
Posted: 24 February 2014 at 8:14am
It looks like there's currently no way to sniff the correct charset from the META tag ahead of time (other than going through the raw byte[] of the MimePart yourself), but since all the faulty messages I was dealing with had the same problem, I was able to hardcode a correction:
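Code:
// Assumes "using System;" and "using MailBee.Mime;" at the top of the file.
// Re-labels an HTML body that claims iso-8859-1 but actually contains UTF-16.
private void FixWrongMimeBodyHeader(MailMessage original)
{
    TextBodyPart html = original.BodyParts.Html;

    // Only touch messages that have an HTML part with the known-bad charset label.
    if (html != null && string.Equals(html.Charset, "iso-8859-1", StringComparison.OrdinalIgnoreCase))
    {
        MimePart mime = html.AsMimePart;
        original.BodyParts.Remove("text/html");

        // Replace the Content-Type header so that the part declares UTF-16.
        mime.Headers.Add("Content-Type", "text/html; charset=\"utf-16\"", true);

        // Round-trip through the raw bytes so the part is re-parsed with the new header.
        byte[] mimeArray = mime.GetRawData();
        MimePart newMime = MimePart.Parse(mimeArray);
        TextBodyPart newHtml = new TextBodyPart(newMime);
        original.BodyParts.Add(newHtml);
    }
}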
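For anyone who does need to sniff the charset rather than hardcode it, something along these lines might serve as a starting point. It is only a sketch: CharsetSniffer and Sniff are hypothetical names, the code uses nothing from MailBee, and it assumes you have already obtained the body bytes with the transfer encoding (quoted-printable here) decoded, which is not shown. It checks for a byte order mark first and falls back to a simple regular expression for the META charset.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class CharsetSniffer
{
    // Returns a charset name guessed from already-decoded body bytes,
    // or null when neither a BOM nor a META charset is found.
    public static string Sniff(byte[] body)
    {
        // 1. A byte order mark is the strongest hint.
        //    (A UTF-32LE BOM also starts with FF FE; that case is ignored here.)
        if (body.Length >= 3 && body[0] == 0xEF && body[1] == 0xBB && body[2] == 0xBF) return "utf-8";
        if (body.Length >= 2 && body[0] == 0xFF && body[1] == 0xFE) return "utf-16";   // little-endian
        if (body.Length >= 2 && body[0] == 0xFE && body[1] == 0xFF) return "utf-16BE";

        // 2. Otherwise look for a META charset in the first few KB. Decoding as
        //    ISO-8859-1 maps every byte to one char; stripping NULs lets the
        //    pattern match BOM-less UTF-16 text as well.
        int len = Math.Min(body.Length, 4096);
        string head = Encoding.GetEncoding("iso-8859-1").GetString(body, 0, len).Replace("\0", "");
        Match m = Regex.Match(head, @"<meta[^>]*charset\s*=\s*[""']?([A-Za-z0-9_\-]+)", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

The returned name could then be used in place of the hardcoded "utf-16" when rewriting the Content-Type header. The regular expression is deliberately loose and will not cope with every way a META tag can be written.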