Script to Compare List of Email Addresses in Exchange


I was provided a list of email addresses of employees in a system that doesn’t interface with AD/Exchange and was asked to validate that those email addresses exist on our Exchange server. I figured the easiest way was to just script it (the list was close to 1000 people). Total run time to compare the list was just a few seconds.

Here’s the (quite simplistic) script to accomplish the task.


$logFile = 'c:\scripts\IsAccountValid.log'

# Uncomment the entry below if not running from the EMS
# Add-PSSnapin Microsoft.Exchange.Management.PowerShell.Admin

# Import the email addresses from text file
$Import = Get-Content "c:\Scripts\emailaddresses.txt"

ForEach ($address in $Import) {
    $valid = Get-Mailbox -Anr $address
    If ($valid) {
        "$address is Valid" >> $logFile
    } Else {
        "$address is Not Valid" >> $logFile
    }
}
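If you’d rather end up with something you can sort or filter afterwards, here’s a rough variant (a sketch only – assuming PowerShell 2.0 or later and the same file paths as above) that collects the results as objects and writes a CSV instead of a flat log:

# Sketch: same check as above, but output goes to a CSV instead of a log file.
$results = ForEach ($address in (Get-Content "c:\Scripts\emailaddresses.txt")) {
    # Suppress lookup errors so a miss simply leaves $mailbox empty
    $mailbox = Get-Mailbox -Anr $address -ErrorAction SilentlyContinue
    New-Object PSObject -Property @{
        Address = $address
        Valid   = [bool]$mailbox
    }
}
$results | Export-Csv "c:\scripts\IsAccountValid.csv" -NoTypeInformation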

WSUS 3.0 SP2 will not run after installing update 2720211


Let me clarify that post subject up there. KB2720211 FAILED to install on my server and at the same time absolutely FUBAR’d the entire thing to the point where Update Services wouldn’t even start. I had our lead programmer stop by this morning and tell me that his computer kept complaining about not being able to download updates… (Note to self — add that into Nagios as I appear to have forgotten).

Anyway, what a mess. I logged onto the server and, between the MMC not opening and being greeted with an IIS worker process crash warning, the event log was chock full of errors:

Event ID 1011: A process serving application pool ‘WsusPool’ suffered a fatal communication error with the World Wide Web Publishing Service. The process id was ‘7540’. The data field contains the error number.
Event ID 7031: The Update Services service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 300000 milliseconds: Restart the service.
Event ID 18456: Login failed for user ‘NT AUTHORITY\NETWORK SERVICE’. [CLIENT: <named pipe>]
Event ID 7032: The WSUS administration console was unable to connect to the WSUS Server via the remote API.

I’m sure you get the point…
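For what it’s worth, if you want to pull those same errors out of the logs in one shot (to confirm you’re hitting the same failure), something along these lines works where PowerShell is available – the event IDs are the ones listed above, the rest is just generic log querying:

# Pull the WSUS/IIS/SQL errors mentioned above from the Application and System logs.
Get-WinEvent -FilterHashtable @{ LogName = 'Application','System'; Id = 1011,7031,7032,18456 } -MaxEvents 100 |
    Select-Object TimeCreated, Id, ProviderName, Message |
    Format-Table -AutoSize -Wrap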

Anyway, it appears this was a fairly widespread issue (great QA work MS). I found one solution that worked for me – copied here (just in case):
Continue reading

Bulk Migrate User Home Directory in Active Directory


This summer I’m decommissioning an old file server that is no longer under warranty, and we have plans for it along with the iSCSI SAN it’s connected to. Slight problem though – even though I’ve configured our user shares to replicate around using DFS, for years the user creation script used for creating accounts hard-coded the user home directory to this file server. With thousands of accounts, I’m not about to go and do all this by hand.

Enter Quest ActiveRoles Management Shell (free). Sure I could have used other tools – but I’ve got this one and I like it. Anyway, here’s the script I wrote to do this. Note: It is likely not perfect, maybe not the most efficient way to handle this (definitely a bit slow), but it works and was run against a few thousand user accounts.
Continue reading
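The full script is behind the link, but as a rough sketch of the approach (not the actual script – the server names are placeholders and I’m assuming the Quest snap-in is available), the core of it looks something like this:

# Find every user whose home directory still points at the old file server
# and repoint it. OLDFILESERVER/NEWFILESERVER are placeholders.
Add-PSSnapin Quest.ActiveRoles.ADManagement

$oldServer = '\\OLDFILESERVER'
$newServer = '\\NEWFILESERVER'

Get-QADUser -SizeLimit 0 | Where-Object { $_.HomeDirectory -like "$oldServer\*" } | ForEach-Object {
    $newPath = $_.HomeDirectory.Replace($oldServer, $newServer)
    Set-QADUser -Identity $_ -HomeDirectory $newPath
}

Pulling every user and filtering client-side is part of why this kind of thing runs slow against a few thousand accounts; it’s the simple way, not the fast way.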

Monday Sucks: DPM BMR Restore of Hyper-V Cluster Nodes


First of all – let me tell you that I did not get to this point lightly. As a matter of fact it was quite honestly the LAST damned place I wanted to be. So how did I get here?

A while back Dell, who we purchased our EqualLogic (EQL for short) units from, contacted us about apparently requiring some annual maintenance as part of our service contract. So they email you up out of the blue and basically tell you that you have 30 days to schedule a time with an engineer who will assist you in updating your firmware on your EQL units. Oh and by the way, they want you to update your Host Integration Tools (HIT) Kit on all iSCSI connected servers as well at the same time. So, I went about collecting the information they wanted and getting some MX time setup for a weekend. I understand (and completely agree with) updating firmware and software that fixes big bugs so I don’t really blame Dell for wanting this done and quite honestly once a year is not too big of a deal.

So, the MX day comes along and all goes pretty well aside from the fact that it took about 12 hours to get the whole thing done because of the EQL firmware, switch firmware and the long list of servers that needed to be done, not to mention that (per Dell) HIT kit 3.5.1 needs to be uninstalled before moving to 4.0. Long day. However, one little issue sprang up. We run Windows Core for our Hyper-V hosts. The tech who was running the show couldn’t find the documentation for how to remove 3.5.1 from Core – so the install of 4.0 was run directly over 3.5.1. Turns out… that was a BAD BAD move. Why? Well my Core Hyper-V hosts are basically headless and configured to boot directly from iSCSI, not from internal disks. After said 4.0 installation, the system for some reason continues to generate more and more iSCSI connections to the boot LUN until… the server crashes, the VMs on said node fail over (ungracefully) to another host and spin back up. All told, my Hyper-V hosts crash once a day, at least.

How do I know Dell’s HIT Kit is to blame? The logs… it’s all in the logs. I went from a few Event ID 116 (Removal of device xxxx on iSCSI session xxxx was vetoed by xxxx) events a week to 200+ a day per host… and the logs started going nuts right after the HIT Kit update reboot.

Note: Typically there’s about 40-50 per EQL Unit not 200 per unit. When Nagios starts sending me these alarms I’m guaranteed a host crash within minutes.
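If you want to see the jump for yourself, counting the 116s per day makes it pretty obvious (assuming PowerShell is available on the host – tweak the filter to match your own log sources):

# Group the iSCSI 'removal vetoed' events (ID 116) by day to see when the storm starts.
Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 116 } |
    Group-Object { $_.TimeCreated.Date } |
    Sort-Object { [datetime]$_.Name } |
    Select-Object Count, Name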

Anybody else see the pattern that starts to develop at some random time after boot? This could happen 12 hours after boot or 3 hours. No rhyme or reason. Just starts creating new connections to the LUN and never releases the old ones until the system goes down in flames.

Case opened with Dell, spent almost 4 hours on the phone while they had tech after tech look at it and come up with bubkis. At Dell’s recommendation I’ve uninstalled 4.0 and gone back to 3.5.1. I’ve gone through the registry bit by bit looking for residual 4.0 stuff. I’ve spent all kinds of time on TechNet and Google looking for anything I can find to try and solve the issue. No joy. I’ve sent multiple DSETs and a couple of Lasso files. Nada. The final straw this morning (summary of email conversation – I was much more polite than this):

Tech: Please send more DSETs so we can send to Engineers.

Me: WTF happened to the last ones I sent on Friday?! Nothing has changed except for more crashes.

Tech: Oh… there they are teehee … let me get them to the engineers.

Me: Like what should have happened two days ago?

… sounds of birds chirping as the emails stopped coming …

Anyway, part of this was my own stupid. I should have (and dammit, I thought about it and then forgot to add it to my checklist) snapshotted the boot LUNs prior to the MX. Whoops. Thankfully I was smart enough to make sure I had BMR backups being done on the Hyper-V hosts (and had tested it before actually trusting it).

The process for running a BMR restore is pretty simple – and even though these are clustered hosts the process remains pretty much the same (even from an iSCSI boot perspective).

The generic process can be found here:
http://blogs.technet.com/b/dpm/archive/2010/05/12/performing-a-bare-metal-restore-with-dpm-2010.aspx

The only gotchas with my setup were:

1. I had to transition all the VMs from one host to the other and put the host in MX mode (SCVMM) – there’s a rough cluster-cmdlet sketch of this below the list.

2. I had to allow the iSCSI connections to start and then cancel booting from the LUN. I was then prompted to boot from CD. This way the Boot LUN was attached to the server and Windows Setup could see it. Obviously doing the restore setup from Admin Tools wasn’t an option on Core.

3. Since my BMR was apparently a little too old (2 weeks?!) I had to disjoin the host from the domain and rejoin it. Not a big deal and the cluster picked the host right back up as if nothing had changed.

4. Only do one host at a time (depends on your cluster’s ability to tolerate failure obviously).
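For step 1 I did the shuffle in SCVMM, but if you’d rather use the native failover cluster cmdlets, the rough equivalent is below (VM and host names are placeholders – repeat the move for each VM on the node):

Import-Module FailoverClusters

# Live migrate a VM group off the node being restored, then pause the node.
Move-ClusterVirtualMachineRole -Name 'SomeVM' -Node 'HV-NODE2'
Suspend-ClusterNode -Name 'HV-NODE1'

# ...and once the restore is done and the node looks healthy again:
Resume-ClusterNode -Name 'HV-NODE1'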

Now I get to spend many hours monitoring every little thing to make sure the host stays stable and it doesn’t go LUN happy again. Needless to say HIT Kit 4.0 is no longer on my ‘update’ list. Here’s hoping this fixes it…

UPDATE: No go. This did not fix the issue. At this point I’m at a loss. Going back to a pre-install BMR recovery and still having the same problem – which did NOT exist when the BMR backup was taken. Temporary solution now is I’m migrating one host to use physical disks inside the chassis (post on how I’m doing that upcoming) – so at least one host is stable and can host the critical VMs along with the ones that can’t handle spontaneous reboots very well (like SQL).

VHD Parent Locator Fixup Failed – DPM 2010 ID 3130


Error:
DPM failed to fixup VHD parent locator post backup for Microsoft Hyper-V \Backup Using Child Partition Snapshot\SERVER(SERVER) on DPMSERVER. Your recovery point is valid, but you will be unable to perform item-level recoveries using this recovery point. (ID 3130)

So I’ve seen this happen a couple times now on my DPM 2010 installation. I wasn’t overly worried about it because it was an occasional alert and usually the next set of recovery points would be fine. However, after a couple VMs started having the same recovery point error on a near-nightly basis, it was time to start figuring out what was going on.

I found this thread which contained a link to this KB article, and that was the solution for the original poster. Unfortunately that KB didn’t really apply to me because all the hosts/VMs in question are 2008 R2. But what it did do was point me to the root of the problem itself. What I had done a while back was set up checkpoints on the systems in question, and then a short while later (when I was done testing a couple things) I deleted all the previous checkpoints for those systems and kept only the current running point in time. The problem was I never took the downtime to merge the differencing disks, and naturally I later forgot that I needed to. So what would end up happening is I would have, say, 5 minutes of one of these VMs being off as I made configuration changes and Hyper-V would begin the merge process. However, SCVMM didn’t show anything like that happening, so when I was done with my config changes I would boot the VM and Hyper-V would cancel the merge process.

So, in short, the solution to this error is quite simple: shut down your VM, open the Hyper-V console (not SCVMM) and wait while the merge process completes. Depending on the size of the differencing disks it may take a (long) while.

And for reference – the easiest way to determine if you have a disk merge that needs to be completed is to browse to your VM storage and check to see if there are AVHD files in there. You can also open the VM’s XML configuration and look for this line:
<disk_merge_pending type="bool">True</disk_merge_pending>
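If you’d rather not click around, here’s a quick sketch of that check from PowerShell (the storage path is a placeholder):

# Look for leftover differencing disks under the VM storage path.
Get-ChildItem -Path 'D:\VMStorage' -Recurse -Include *.avhd |
    Select-Object FullName, Length, LastWriteTime

# And check each VM configuration file for the pending-merge flag.
Get-ChildItem -Path 'D:\VMStorage' -Recurse -Include *.xml |
    Select-String -SimpleMatch 'disk_merge_pending' |
    Select-Object Path, Line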