Overview
GFI Archiver logs the following error when it fails to index an attachment that contains certain Unicode characters:
<date>,<time>,524,1,"#00000F70","#00000008","error ","IndexableAttachment","error: Failed to index attachment Doc11.doc: System.Text.EncoderFallbackException: Unable to translate Unicode character \uDEAD at index 359 to specified code page.; at System.Text.EncoderExceptionFallbackBuffer.Fallback(Char charUnknown, Int32 index); at System.Text.EncoderFallbackBuffer.InternalFallback(Char ch, Char*& chars); at System.Text.UTF8Encoding.GetBytes(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, EncoderNLS baseEncoder); at System.Text.EncoderNLS.GetBytes(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Boolean flush); at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder); at MArc.Search.Core.IndexableAttachment.ProcessAttachment(WCFClient`1 storeEmailRetrievalProxy, Guid dbGuid, Int32 messageId, Int32 attachmentId, Attachment attachment, StreamWriter writer)
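The message names the offending Unicode character (\uDEAD in this example) and its index in the text (359 in this example). Both values can differ between occurrences. \uDEAD lies in the surrogate range (U+D800 to U+DFFF); a surrogate that is not part of a valid pair has no UTF-8 encoding, which is why the encoder throws instead of writing the character. When this error occurs, a temporary file can be left behind in ..\MailArchiver\Search\Temp.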
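The failure can be reproduced outside the product. The following is a minimal Python sketch, not GFI Archiver code: it assumes only that the failing text contains a lone surrogate such as \uDEAD, and the helper name find_unencodable is hypothetical. It shows that a strict UTF-8 encoder rejects such a character and reports the position of each one, comparable to the index in the log message.

```python
def find_unencodable(text):
    """Return (index, escaped character) for every character that
    a strict UTF-8 encoder refuses to encode."""
    problems = []
    for index, ch in enumerate(text):
        try:
            ch.encode("utf-8")  # strict error handling is the default
        except UnicodeEncodeError:
            problems.append((index, "\\u%04X" % ord(ch)))
    return problems


# Illustrative input only: 359 ordinary characters followed by a lone surrogate.
sample = "a" * 359 + "\udead" + "tail"

for index, escaped in find_unencodable(sample):
    print("Cannot encode", escaped, "at index", index)
```

Python treats every surrogate in a string as unencodable, whereas .NET strings are UTF-16 and only an unpaired surrogate triggers the exception, so the sketch illustrates the cause rather than the product's exact behavior.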
Environment
GFI Archiver Build 20130510
Resolution
Upgrade to build 20130704 or newer, or apply the patch attached to this article (MARC2013_PATCH_20130703_01).
Priyanka Bhotika