Mystery of the Server 2008 + IIS7 + OLE = "MDAC Not Installed" Error

Talk about barking up the wrong tree!

I just solved a very peculiar error that came up while migrating old ASP.Net OLEDB Excel/Access file access code to a new 64-bit Windows Server 2008 machine running IIS7.  The first steps for migrating this application were obvious: the application pool needs to run in 32-bit compatibility mode, the pipeline needs to be classic, and so on.  But when we got to the Excel export functionality, all hell broke loose.  No matter what we did, we would get the following exception:

Exception of type 'System.Web.HttpUnhandledException' was thrown. [System.Web.HttpUnhandledException]
The .Net Framework Data Providers require Microsoft Data Access Components(MDAC). Please install Microsoft Data Access Components(MDAC) version 2.6 or later. [System.InvalidOperationException]
Retrieving the COM class factory for component with CLSID {2206CDB2-19C1-11D1-89E0-00C04FD7A829} failed due to the following error: 800703fa. [System.Runtime.InteropServices.COMException]

The error was a red herring.  The problem is not that MDAC is not installed – it is far more obscure.  WDAC (MDAC's successor) is included in Windows Server 2008 and has plenty of backward compatibility built into it.  That part of the equation is perfectly fine.  In fact, if you run console/forms apps, you probably will never see this problem.  It is only when you are running ASP.Net code and use a custom identity for the Application Pool (possibly some other situations too) that this error occurs.

By doing a ProcMon trace, I was able to pin down the problem.  If you follow all the steps that w3wp.exe performs, you will see one really weird line.

Line from ProcMon 

Okay, the unknown result code (0xC0000425) is weird enough, but ProcMon couldn't even tell me which registry key it couldn't find!  The stack trace confirmed that this was the only logged request with ole32.dll loaded, and it showed ole32.dll throwing an error, so it really did appear to be related to my problem.

ProcMon stack trace

Thankfully, the Great Gazoogle did not let me down.  It turns out that 0xC0000425 indicates that an application was attempting to access a registry hive after it was unloaded.  What the heck?
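
As a side note, the 800703fa in the COM exception can be taken apart the same way.  The sketch below is only an illustration: it splits an HRESULT into its standard bit fields, and the two small lookup tables are hand-filled assumptions covering just the codes seen in this post (the function and table names are made up for the example).  The low 16 bits come out as Win32 error 1018, ERROR_KEY_DELETED, which fits the unloaded-hive story.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and code fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility; 7 is FACILITY_WIN32
        "code": value & 0xFFFF,             # low 16 bits: the underlying error code
    }


# Hand-filled lookups for the codes discussed in this post only (not a full table).
KNOWN_WIN32 = {1018: "ERROR_KEY_DELETED"}
KNOWN_NTSTATUS = {0xC0000425: "STATUS_HIVE_UNLOADED"}

hr = decode_hresult(int("800703fa", 16))
print(hr)
print(KNOWN_WIN32.get(hr["code"], "unknown"))
print(KNOWN_NTSTATUS.get(0xC0000425, "unknown"))
```

Both codes point the same direction: something tried to use a registry key whose hive was no longer there.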

This is a good time to explain how the registry works at a very high (and simple) level.  Each registry hive is a self-contained database stored in its own file in the file system.  A good example is NTUSER.DAT, located in your user profile folder.  That's your very own HKCU.  Not all of the registry is loaded all the time.  This is what allows Windows to move your profile around, or copy your profile (registry included), without having to do individual registry reads and writes (much slower).

If HKLM or HKCR were unloaded while Windows was running, your computer would probably just freeze up completely, so those were not good candidates for this problem.  It had to be a user hive.  Since it was the application pool identity that got this error, it had to be that user's profile.  But how can its own registry get unloaded in the middle of execution?

I found the answer at a Sun support forum, of all places.  It turns out that in previous versions of Windows, you could download a program called UPHClean that would help unload user profiles when they were no longer needed, supposedly to avoid the annoying user profile deadlocks that would sometimes occur and force you to use a temporary profile instead of your own.  If you haven't experienced this error before, you are lucky!  It is rare, but when it occurs, it is very annoying.  From what I've seen, it's more common on Terminal Servers than on other servers.

Well, in Server 2008, the functionality of the UPHClean service has been rolled into the User Profile service, which always runs.  So Server 2008, on the whole, more aggressively shuts down dangling registry handles when a user logs off.  The aforementioned support forum participant recommended shutting off the UPHClean functionality entirely.  I'm not a big advocate of fixing shovel-sized problems with a backhoe, so I decided to dig a little more.

To see if this really was the cause of my woes, I searched for Event ID 1530 in the event log to see if a user's HKCU was ever forcefully unloaded.  Yes, there had been occurrences; yes, the process involved was w3wp.exe; and yes, the SID corresponded to the application pool identity.  I'm getting closer to a solution!

Windows detected your registry file is still in use by other applications or services. The file will be unloaded now. The applications or services that hold your registry file may not function properly afterwards. 

DETAIL -
2 user registry handles leaked from \Registry\User\S-1-5-21-1129910693-4165624395-2147873099-2135_Classes:
Process 6632 (\Device\HarddiskVolume1\Windows\SysWOW64\inetsrv\w3wp.exe) has opened key \REGISTRY\USER\S-1-5-21-1129910693-4165624395-2147873099-2135_CLASSES\Wow6432Node
Process 10144 (\Device\HarddiskVolume1\Windows\SysWOW64\inetsrv\w3wp.exe) has opened key \REGISTRY\USER\S-1-5-21-1129910693-4165624395-2147873099-2135_CLASSES\Wow6432Node
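
If there are many of these events to wade through, a small script can pull out the interesting parts.  The following is a rough sketch that assumes the DETAIL text looks exactly like the lines above; the regular expression and the parse_leaks helper are made up for this example and are not part of any Windows tooling.

```python
import re

# Matches lines like: Process <pid> (<image path>) has opened key \REGISTRY\USER\<SID>...
LEAK_LINE = re.compile(
    r"Process (?P<pid>\d+) \((?P<image>[^)]+)\) has opened key "
    r"\\REGISTRY\\USER\\(?P<sid>S-[\d-]+)",
    re.IGNORECASE,
)


def parse_leaks(detail: str) -> list:
    """Return one dict per leaked handle: process ID, executable name, and user SID."""
    return [
        {
            "pid": int(m["pid"]),
            "image": m["image"].rsplit("\\", 1)[-1],  # keep only the file name
            "sid": m["sid"],
        }
        for m in LEAK_LINE.finditer(detail)
    ]


# The two "Process ..." lines from the event shown above.
sample = r"""
Process 6632 (\Device\HarddiskVolume1\Windows\SysWOW64\inetsrv\w3wp.exe) has opened key \REGISTRY\USER\S-1-5-21-1129910693-4165624395-2147873099-2135_CLASSES\Wow6432Node
Process 10144 (\Device\HarddiskVolume1\Windows\SysWOW64\inetsrv\w3wp.exe) has opened key \REGISTRY\USER\S-1-5-21-1129910693-4165624395-2147873099-2135_CLASSES\Wow6432Node
"""

leaks = parse_leaks(sample)
for leak in leaks:
    print(leak["pid"], leak["image"], leak["sid"])
```

Comparing the extracted SID against the application pool identity's SID is the same check described above, just automated.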

So why was this happening?  Clearly my application has not exited.  Why does the User Profile Service think it can arbitrarily close that registry hive while my application pool is still doing work?  I'm guessing that the UPHClean functionality detects the open handles and, when enumerating loaded profiles, sees that the user's profile is not loaded, so it decides that those open handles must be dangling and should be forcefully closed.

The solution turns out to be a very simple configuration setting in IIS7's Application Pool Advanced Settings.  To fix this problem, all you need to do is enable profile loading (the Load User Profile setting) for the application pool.  This tells IIS to load the user profile completely and keep it loaded for the entire execution of the application.  The default approach is the IIS6 approach, where the profile is not loaded when the user is impersonated by the IIS service.  In IIS6, this was okay because on Server 2003 UPHClean was an optional download and not installed by default.  If it's not installed, it can't unload the hive, obviously.  :-) I would imagine if you did install it on 2003, you would have a similar problem in a similar situation.

IIS7 advanced settings
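
The same setting is stored as the loadUserProfile attribute on an application pool's processModel element in applicationHost.config.  As a hedged sketch, the script below scans a config fragment and lists the pools that do not have it turned on.  The sample XML, the pool names, and the pools_without_profile helper are invented for illustration, and the sketch ignores anything inherited from applicationPoolDefaults; it simply assumes a missing attribute means "off", as described above.

```python
import xml.etree.ElementTree as ET

# Invented fragment shaped like the applicationPools section of applicationHost.config.
SAMPLE_CONFIG = """
<configuration>
  <system.applicationHost>
    <applicationPools>
      <add name="LegacyExcelPool" enable32BitAppOnWin64="true" managedPipelineMode="Classic">
        <processModel identityType="SpecificUser" loadUserProfile="true" />
      </add>
      <add name="OtherPool">
        <processModel identityType="SpecificUser" />
      </add>
    </applicationPools>
  </system.applicationHost>
</configuration>
"""


def pools_without_profile(config_xml: str, default: bool = False) -> list:
    """Return the names of application pools whose user profile is not loaded."""
    root = ET.fromstring(config_xml)
    flagged = []
    for pools in root.iter("applicationPools"):
        for pool in pools.findall("add"):
            model = pool.find("processModel")
            value = model.get("loadUserProfile") if model is not None else None
            # A missing attribute falls back to the assumed default (off).
            enabled = default if value is None else value.lower() == "true"
            if not enabled:
                flagged.append(pool.get("name"))
    return flagged


print(pools_without_profile(SAMPLE_CONFIG))
```

Any pool this flags that also runs under a custom identity and touches OLE/COM would be a candidate for the problem described in this post.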

So, in conclusion, no profile loading means premature registry unloading, which means no OLE, no MDAC, and, therefore, no Excel/Access.

Mystery solved.