Diagnosing the Unknown Alias problem in SharePoint

My frustraition with a product that is, in theory, action packed, but faultering in execution continues - this time it is incoming email. A well behaved mail gateway should return a non-delivery receipt if it is unable to deliver email, it's common courtesy. SharePoint, both Windows SharePoint Services and Office SharePoint Service, seems to take exception to this idea. As an IT guy it is the most frustraiting thing to see a module that was slapped together with little regard to proper protocol behaviour. My frustration started when email enabled lists just stopped capturing email randomly. I am not alone in this frustration either, there are clearly others yet not a peep from Microsoft:

The SPTimer service polls the SMTP drop folder on a regular basis to look for incoming messages. When the message is picked up it is read in and the alias appears to be cross referenced against an EmailEnabledLists table. I was tipped off to this by a post on the SharePoint newsgroup. If the logging verbosity is set and the alias is not found it will log an Unknown Alias error to both the event log and the trace logs:

Type:		Warning
Date:		2/26/2008
Time:		8:42:30 AM
Event:		6873
Source:		Windows SharePoint Services 3
Category:	E-Mail
User:		N/A
Description:
An error occurred while processing the incoming e-mail file
D:\SMTP\Drop\54f3fb0601c8787d0000024f.eml. The error was: Unknown alias..

Along with the following event:

Type:		Information
Date:		2/26/2008
Time:		8:42:30 AM
Event:		6871
Source:		Windows SharePoint Services 3
Category:	E-Mail
User:		N/A
Description:
The Incoming E-Mail service has completed a batch.  The elapsed time was 00:00:00.0156256.
The service processed 1 message(s) in total.


Errors occurred processing 1 message(s):

Message ID: <SERVERxiDd9D9DQVXwR100000a19@domain>

 The following aliases were unknown:
 mylist

After poking around in the database to see if I could find any hints as to why this functionality decided to stop all of a sudden. Running the following query on each content database gave me a dump of the email aliases that each list was configured against:

SELECT 	    tp_EmailAlias as EmailAlias,
            SiteId,
            tp_WebId AS WebId,
            tp_Id as ListId
FROM        AllLists
INNER JOIN  Webs ON AllLists.tp_WebId = Webs.Id
WHERE       tp_EmailAlias IS NOT NULL

Cross referencing the results with the EmailEnabledLists table I was able to find that there were some, no, a lot of missing rows. Something was deleting my email enabled lists. At this point I don't know what, but it smells like a bug. In the mean time I have created a script to re-populate the configuration database table based on the content databases.

SET NOCOUNT ON
IF OBJECT_ID('tempdb..#EmailEnabledLists') IS NOT NULL
BEGIN
   DROP TABLE #EmailEnabledLists
END

CREATE TABLE #EmailEnabledLists (
    Alias   NVARCHAR(128) NOT NULL,
    SiteId  UNIQUEIDENTIFIER NOT NULL,
    WebId   UNIQUEIDENTIFIER NOT NULL,
    ListId  UNIQUEIDENTIFIER NOT NULL)

-- TODO: Insert your SharePoint Content Database Names into the Set
DECLARE DatabaseCursor CURSOR FOR
    SELECT  [Name]
    FROM    master.[sys].databases
    WHERE   [State] = 0 AND -- State = Online
            [Name] IN ('SharePoint_ContentDatabase1', 'SharePoint_ContentDatabase2')
    ORDER BY [Name]
OPEN DatabaseCursor

DECLARE @DatabaseName VARCHAR(128)
FETCH NEXT FROM DatabaseCursor INTO @DatabaseName
WHILE @@FETCH_STATUS = 0
BEGIN

    DECLARE @Command NVARCHAR(4000)
    SET @Command =
        N'USE [' + @DatabaseName + '];' + 
        N'INSERT INTO #EmailEnabledLists (Alias, SiteId, WebId, ListId) ' +
        N'SELECT  ' +
        N'    tp_EmailAlias as Alias, ' +
        N'    SiteId, ' +
        N'    tp_WebId AS WebId, ' +
        N'    tp_Id as ListId ' +
        N'FROM ' +
        N'    AllLists ' +
        N'    INNER JOIN  ' +
        N'        Webs ' +
        N'    ON      AllLists.tp_WebId = Webs.Id ' +
        N'    WHERE   tp_EmailAlias IS NOT NULL'
    EXEC sp_executesql @Command
    
    FETCH NEXT FROM DatabaseCursor INTO @DatabaseName
END
CLOSE DatabaseCursor
DEALLOCATE DatabaseCursor

-- TODO: Insert your SharePoint Configuration Database Name
USE [SharePoint_Config]

DECLARE AliasCursor CURSOR FOR
    SELECT  [Alias]
    FROM    #EmailEnabledLists
    ORDER BY [Alias]
OPEN AliasCursor

DECLARE @AliasName NVARCHAR(128)
FETCH NEXT FROM AliasCursor INTO @AliasName
WHILE @@FETCH_STATUS = 0
BEGIN
    IF(NOT EXISTS (SELECT Alias FROM dbo.EmailEnabledLists WHERE Alias = @AliasName))
    BEGIN
        DECLARE @SiteId UNIQUEIDENTIFIER
        DECLARE @WebId UNIQUEIDENTIFIER
        DECLARE @ListId UNIQUEIDENTIFIER

        SELECT  @SiteId = SiteId,
                @WebId = WebId,
                @ListId = ListId
        FROM    #EmailEnabledLists

        PRINT 'Inserting ' + @AliasName
        INSERT  dbo.EmailEnabledLists
                (Alias, SiteId, WebId, ListId, [Deleted])
        VALUES  (@AliasName, @SiteId, @WebId, @ListId, 'FALSE')
    END

    FETCH NEXT FROM AliasCursor INTO @AliasName
END
CLOSE AliasCursor
DEALLOCATE AliasCursor

IF OBJECT_ID('tempdb..#EmailEnabledLists') IS NOT NULL
BEGIN
   DROP TABLE #EmailEnabledLists
END

Now the million dollar question - what is the root cause? If I had time I would setup a trigger on the EmailEnabledLists table to notify me when something is deleted to get an idea of time frame and then use SQL Profiler to watch that process to see what the calls looks like and where they are coming from. Better yet, maybe some folks from the SharePoint team can chime in and share some more to help us understand the quirks of a product struggling for stability in the face of mass adoption.

UPDATE (24-Mar-08): Linked to my TechNet forums post as well, but no further progress. Looking into the Microsoft.SharePoint.Administration.SPEmailEngine class with Reflector I cannot see any code to generate NDRs either.

Comments Subscribe to Post Comments Feed

Todd Klindt said:

Good investigation work.  I'm curious to see if we get an official response.  I can't reproduce the issue or I'd burn a PSS call trying to figure it out.

tk

Jay said:

I have had the same problem exactly. I find it ridiculous that the system just forgets the aliases. I also can't believe that WSS would not have a catch all for incoming email with no alias. The emails just disappear. That's a great policy. The whole thing has lead me to deliver messages to both WSS and the public folders WSS is meant to replace. Unbelievable!

John said:

Not sure if this is exactly the same thing, but I have a problem with MOSS publishing sites. We frequently have a case where our user in the Approvers group for a site collection doesn't get notified when a page has been Submitted For Approval. When we check the user's MySite profile, their Work E-mail field is blank. We have also seen the same scenario with the user who submitted the page for approval not receiving a confirmation email when the page gets approved.

Right now our fix is to remove and re-add the user to the Approvers group.

John

Peter Seale said:

Thanks for taking the time to gather all this up. We're experiencing the same problem and have opened a case; I'm going to forward them to this page to summarily describe what is happening. I'll let you know how this works out.

Patrick said:

Colin,

   Glad i read your blog.I have done i could to configure WSS list o receive incoming mails but NO SUCCESS.

I keep seeing this in WSS LOG:

05/02/2008 14:01:42.09 OWSTIMER.EXE (0x1324)                   0x1300 Windows SharePoint Services   E-Mail                         6871 Information The Incoming E-Mail service has completed a batch.  The elapsed time was 00:00:00.  The service processed 0 message(s) in total.

I can see my emails in my drop folder for some seconds and then  dissapears..

MS Share point  Team where are you??

If you have any solutions for this please email me to : naijacoder@hotmail.com

Thanks Colin

Kurt said:

Great work here.  I used the script above to re-insert the mysteriously missing entries from EmailEnabledLists.  However, when I recycled the timer, and it processed the very next one, the entry was once-again nuked from the EmailEnabledLists table.  This is bad...

Any traction anywhere on this???

m@ said:

I just wanted to pass on  that we are experiencing the same issue (Random and usually discovered by the user community)

Our work around has been un-checking and re-checking the “Overwrite files with the same name” flag in the non-working library (Email Settings).

We are not sure why this flag reset gets the mail “receiving” going again for that library but it does.

Any additional information on the culprit would grateful.

Great blog BTW!

Geoff said:

Great diagnostic work. This is exactly what has just happened to us. There seems to be little available from Microsoft about this so I'm gonna log a ticket. THey can't ignore us forever, right?

Karl said:

Hi guys,

Great post! I have discovered the exact same thing as you and I was more than please to see that it was a kind of "known issue". I have tried to populate the EmailEnabledLists table as well ans dame thing.. after the first try, my new entries are gone! I have open a ticket with Microsoft releated to this problem and they seem to be aware of thie situation. On our side this problem beggin when we have move our Content Databases into a new SQL server. They should keep me in touch when they will have a work around.

WSS said:

This command should resolve the issue, without having to update the config database directly.

technet.microsoft.com/.../cc262558.aspx

Peter Seale said:

We had this issue again. Last time we submitted this to MS (non-premier) support they said it would be difficult to identify the cause, so I droppped the issue. Since the last episode, I have begun to tell my users that incoming email is unreliable and they will need to be okay with lost email.

Pretty rough, but it's better to tell them it doesn't work and have them be pleasantly surprised, than to allow them to rely on it, and have it fail to live up to expectations.

Brebinator said:

Hi guys,

Well my way around at least reducing the error is to create Exchnage Mail boxes that then do a mail flow to the contact that you create from Sharepoint. This way I at least ensure that all mail comes into an account and is not lost before it then is sent over to the libraries - or not as the case may be.

Also I have found an issue in that internal users cannot attach files that are sent to MOSS. This is due to a MAPI attribute not getting set up correctly on the AD Contact account.

Crazy crap... and not really enterprize... Governenace is crap as well. :-)

Have Your Say