Author Archives: kartikkopalle

Load Balancing the SCOM SDK Service with a Microsoft NLB Cluster

In SCOM, high availability can be achieved at the Management Server, Gateway Server and agent level. Drilling down further, HA can also be configured for the SCOM console so that monitoring is uninterrupted.

SDK clients such as the SCOM console connect to the Data Access (SDK) service on a Management Server. These connections can be kept continuously available with Network Load Balancing, so the console stays operational even if a Management Server fails.

SCOM Operations Console connections can be made highly available with Microsoft Network Load Balancing (NLB), hardware load balancers or DNS aliases.

In this demo, I have chosen to use Microsoft Network Load Balancing.

 

Prerequisites:

  1. Assign static IP addresses (not DHCP) to the SCOM Management Servers
  2. Enable the Microsoft Network Load Balancing feature on the Management Servers
  3. Create the NLB cluster
  4. Add nodes to the cluster
  5. Add the cluster DNS record to the DNS zone
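
The prerequisites above can also be scripted. The following is a minimal sketch, assuming the NetworkLoadBalancingClusters and DnsServer PowerShell modules are available. The interface name "Ethernet", the cluster IP 192.168.1.50 and the multicast operation mode are placeholders of mine and are not taken from this demo; substitute the values from your own environment.

```powershell
# Run on the Primary Management Server (Server1.kartik.com).
# ASSUMPTIONS: the interface name and cluster IP below are placeholders.
$interface = "Ethernet"
$clusterIP = "192.168.1.50"

# Step 2: enable the NLB feature on both Management Servers
Invoke-Command -ComputerName "Server1.kartik.com", "Node2.kartik.com" -ScriptBlock {
    Install-WindowsFeature -Name NLB -IncludeManagementTools
}

# Step 3: create the NLB cluster on the first node
New-NlbCluster -InterfaceName $interface -ClusterName "SCOMConsole.kartik.com" `
    -ClusterPrimaryIP $clusterIP -OperationMode Multicast

# Step 4: add the Secondary Management Server as a second node
Get-NlbCluster | Add-NlbClusterNode -NewNodeName "Node2.kartik.com" -NewNodeInterface $interface

# Step 5: run this one on the DNS server to create the "A" record for the cluster
Add-DnsServerResourceRecordA -ZoneName "kartik.com" -Name "SCOMConsole" -IPv4Address $clusterIP
```

The GUI walkthrough below performs the same steps through Server Manager, Network Load Balancing Manager and the DNS console.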

 

Primary Management Server: Server1.kartik.com

Secondary Management Server: Node2.kartik.com

Enable the Microsoft Network Load Balancing feature on both Management Servers

1

2.PNG

3.PNG

4.PNG

5.PNG

After the NLB feature is installed, open Network Load Balancing Manager from Administrative Tools and create the NLB cluster

6.PNG

Add the Primary Management Server as a host

7.PNG

8.PNG

Specify the cluster IP address

9.PNG

Give the cluster a name

10.PNG

 

11.PNG

Connect the host to the cluster

Add the Secondary Management Server to the cluster

12.PNG

13.PNG

15

16

Add the cluster DNS record to the DNS zone on the DNS server

Log in to the DNS server

Create an "A" record for the cluster

17.PNG

 

18.PNG

 

Access the SCOM console using the NLB name SCOMConsole.kartik.com

 

19.PNG

We can now see that the SCOM console is operational when connected through the NLB cluster name

20.PNG

Test the Functionality:

To test this, I stopped the Data Access Service (the SDK service) on the Secondary Management Server. The SCOM console depends on this service to connect to the management group.

The SCOM console is connected to SCOMConsole.kartik.com

22.PNG

Let us stop the SDK Service
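
The same test can be run from PowerShell. This is a minimal sketch, assuming the Data Access Service is registered under the service name OMSDK and that the console connects on TCP port 5724, which is the usual default; verify both in your own environment before relying on it.

```powershell
# Run on the Secondary Management Server (Node2.kartik.com).
# ASSUMPTION: OMSDK is the service name of the System Center Data Access Service.
Stop-Service -Name OMSDK
Get-Service -Name OMSDK | Select-Object Name, Status

# Run from the machine where the console is open.
# ASSUMPTION: 5724 is the default SDK port used by console connections.
Test-NetConnection -ComputerName "SCOMConsole.kartik.com" -Port 5724 |
    Select-Object ComputerName, RemotePort, TcpTestSucceeded
```

Start the service again with Start-Service -Name OMSDK once the test is complete.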

21.PNG

23.PNG

We can see that the SCOM Console is still operational

24.PNG


 

 

 


Installing Root Certificate Authority and Creating SCOM Template

System Center Operations Manager can manage domain-joined machines using the default Kerberos authentication when port 5723 is open. Machines that are not joined to a domain (workgroup computers), or that are in a domain which does not trust the Operations Manager domain, can be managed by importing certificates on both the Gateway/Management Server and the client machine.

This post covers configuring the Certification Authority role and creating a certificate template.

CA Server : AD.kartik.com

Log in to the Active Directory server as a domain admin and configure the CA role

Navigate to Server Manager and select Add Roles and Features

Select the Active Directory Certificate Services role

1

Select Certification Authority, Certificate Enrollment Web Service and Certification Authority Web Enrollment.

2

Specify credentials to configure the AD CS role

3.png

 

4.png

Select Enterprise CA

5.png

Specify the CA type as Root CA

 

6.png

Select the option Create a new private key

7.png

 

Select the default options for the cryptographic provider and key length, and select SHA256 as the hash algorithm

8.png

Specify a name for the Certification Authority

9.png

Specify the validity period as per company policy

10.png

Choose the default database locations

11.png

Verify the selected options

12.png

 

13.png
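
For reference, the wizard steps above have a PowerShell equivalent. This is a minimal sketch, assuming the ADCSDeployment module that ships with the role. The CA common name "kartik-AD-CA", the 2048-bit key length and the 5-year validity are placeholders of mine; the original steps only specify an Enterprise Root CA with SHA256 and otherwise default options.

```powershell
# Run on the CA server (AD.kartik.com) as a domain admin.

# Install the three role services selected in the wizard
Install-WindowsFeature -Name ADCS-Cert-Authority, ADCS-Web-Enrollment, ADCS-Enroll-Web-Svc `
    -IncludeManagementTools

# Configure an Enterprise Root CA with a new private key and SHA256.
# ASSUMPTIONS: the CA name, key length and validity period are placeholders;
# set the validity period according to your company policy.
Install-AdcsCertificationAuthority -CAType EnterpriseRootCA `
    -CACommonName "kartik-AD-CA" `
    -KeyLength 2048 `
    -HashAlgorithmName SHA256 `
    -ValidityPeriod Years -ValidityPeriodUnits 5
```

The database and log locations are not specified here, so the defaults are used, matching the choice made in the wizard.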

Configure the additional role services

14.png

Specify credentials to configure role services

15.png

 

16.png

Select Windows integrated authentication as the authentication type

17.png

Specify the service account for CES

18.png

 

19.png

Select Certification Authority from the Tools menu in Server Manager

20.png

Right-click Certificate Templates and select Manage

22

Select the template IPSec (Offline request) and select Duplicate Template

23.png

Leave the Compatibility tab at its defaults

Give the template an appropriate name under the General tab

Select the validity period as per the security policy

24.png

Under Request Handling, check Allow private key to be exported

26.png

Under Cryptography, select Microsoft RSA SChannel Cryptographic Provider and Microsoft Enhanced Cryptographic Provider v1.0 as the providers

 

27

Navigate to the Extensions tab, select Application Policies and click Edit

28.png

Select Client Authentication and Server Authentication

29.png

Navigate to the Security tab, select Authenticated Users and click Add

30.png

Under Object Types, select Computers

31.png

Search for the SCOM Management Servers

32.png

Grant Read and Enroll permissions to the Management Servers

33.png

Go back to the Certification Authority console, right-click Certificate Templates and select New > Certificate Template to Issue

34.png

Select the template that was created earlier

35.png
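
Publishing the template can also be done from PowerShell on the CA server. This is a minimal sketch, assuming the ADCSAdministration module. "SCOMCertificate" is a placeholder of mine for the template name chosen on the General tab; note that the cmdlet expects the template name, which can differ from the display name.

```powershell
# Run on the CA server (AD.kartik.com).
# ASSUMPTION: "SCOMCertificate" is a placeholder for the duplicated template's name.
Add-CATemplate -Name "SCOMCertificate" -Force

# List the templates the CA is currently able to issue, to confirm it was added
Get-CATemplate | Select-Object Name
```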

Launch https://ad/certsrv (https://adservername/Certsrv) from the Management Server and select Advanced certificate request

36.png

The certificate template should be visible here

37.png

 

 

 

 

 

System Center Operations Manager 2016 High Availability – Configuration

High availability matters for any application, and it is strongly recommended for a monitoring application. An HA setup for a monitoring solution makes sure that monitoring stays on and the service remains available without interruption.

Since System Center 2012, HA has been made easier by the concept of the resource pool, where the members of the pool share the workload and take over from one another when a member fails. The same principle applies in System Center 2016.

Scenarios of HA in System Center Operations Manager

  1. Agent failover to a Management Server from the resource pool
  2. Gateway Server failover to a Management Server
  3. Gateway agent (domain-joined) failover
  4. Gateway agent (workgroup) failover

To test this failover functionality, I configured the following servers in my lab

  • Domain: Kartik.com
  • SCOM Primary Management Server : SCOM2016.kartik.com
  • SCOM Secondary Management Server: SCOM2.kartik.com
  • Gateway Server 1 : Server1.Kartik.com
  • Gateway Server 2 : Node2.kartik.com
  • Domain joined Client Server : Client2.kartik.com
  • Workgroup Computer : Client

1. Agent Server fail-over to Management Server from a Resource Pool

In this scenario, the agents report to the Management Server resource pool. When one Management Server goes down, the agents reporting to it fail over to another Management Server available in the pool.

Test Fail-over

Scenario:

Primary Management Server: SCOM2.kartik.com

Failover Management Server : SCOM2016.kartik.com

Client Server: Client2.kartik.com
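
Before shutting anything down, it can help to record which Management Servers the agent currently uses. This is a minimal sketch run from the Operations Manager Shell; it reuses the GetPrimaryManagementServer() and GetFailoverManagementServers() methods that appear in the verification scripts later in this post. An empty failover list may simply mean that no failover server has been set explicitly for the agent.

```powershell
# Run from the Operations Manager Shell on a Management Server
Import-Module OperationsManager

# Look up the agent used in this test
$agent = Get-SCOMAgent -DNSHostName "Client2.kartik.com"

# Show the primary and any explicitly configured failover Management Servers
"Agent       :: " + $agent.Name
"Primary MS  :: " + ($agent.GetPrimaryManagementServer()).ComputerName
foreach ($managementServer in $agent.GetFailoverManagementServers()) {
    "Failover MS :: " + $managementServer.ComputerName
}
```

Running the same commands again after the failover test shows whether the assignment changed.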

1.png

2.png

 

3

Shut down the Management Server SCOM2.kartik.com to test the agent failover

11.png

SCOM2 shows as grey in the SCOM console

 

6.png

Event Logs from SCOM2016.kartik.com

4

 

Logs from SCOM2016.kartik.com

5.png

Logs from SCOM2016.kartik.com

7.png

Logs from Client2.kartik.com

Here, we see that the agent successfully failed over to SCOM2016.kartik.com

9

Client2.kartik.com shows as healthy in the SCOM console

10

 

2. Gateway Server Fail-over

Gateway Server: Server1.kartik.com

Primary Management Server: SCOM2.kartik.com

Failover Management Server: SCOM2016.kartik.com

 

13

  • Powershell Commands to configure Gateway Server failover

 

# Look up the primary and failover Management Servers and the gateway
$primaryMS = Get-SCOMManagementServer -Name "SCOM2.kartik.com"
$failoverMS = Get-SCOMManagementServer -Name "SCOM2016.kartik.com"
$gatewayMS = Get-SCOMGatewayManagementServer -Name "Server1.kartik.com"

# Assign the primary parent first, then the failover parent
Set-SCOMParentManagementServer -Gateway $gatewayMS -PrimaryServer $primaryMS
Set-SCOMParentManagementServer -Gateway $gatewayMS -FailoverServer $failoverMS

14.png

Powershell Commands to verify Gateway Server Fail-over 

# List every gateway with its primary and failover Management Servers
$GWs = Get-SCOMManagementServer | Where-Object { $_.IsGateway -eq $true }
$GWs | Sort-Object Name | ForEach-Object {
    Write-Host ""
    "Gateway MS    :: " + $_.Name
    "--Primary MS  :: " + ($_.GetPrimaryManagementServer()).ComputerName
    $failoverServers = $_.GetFailoverManagementServers()
    foreach ($managementServer in $failoverServers) {
        "--Failover MS :: " + $managementServer.ComputerName
    }
}
Write-Host ""

15.png

Verify Gateway Server Fail-Over

Shut down the primary Management Server SCOM2.kartik.com

Logs from SCOM2016.kartik.com

16.png

Event generated in SCOM console for SCOM2.kartik.com

17

Logs from Server1.kartik.com showing that it successfully failed over to SCOM2016.kartik.com

18

Server1.kartik.com shows as healthy in the SCOM console

 

19

3. Gateway Agent ( domain-joined ) failover

Client: Client2.kartik.com

Primary Gateway Management Server: Server1.kartik.com

Failover Gateway Management Server: Node2.kartik.com

20

Client2.kartik.com reporting to Gateway Server1.kartik.com

21

 

Powershell commands to configure Gateway Agent failover

# Gateways to use as primary and failover for the agents
$primaryMS = Get-SCOMManagementServer | Where-Object { $_.Name -eq 'Server1.kartik.com' }
$failoverMS = Get-SCOMManagementServer | Where-Object { $_.Name -eq 'Node2.kartik.com' }
# Agents currently reporting to the primary gateway
$agent = Get-SCOMAgent | Where-Object { $_.PrimaryManagementServerName -eq 'Server1.kartik.com' }
Set-SCOMParentManagementServer -Agent $agent -PrimaryServer $primaryMS
Set-SCOMParentManagementServer -Agent $agent -FailoverServer $failoverMS

22.png

Powershell commands to verify Gateway Agent failover

 

# List every agent behind Server1 with its primary and failover servers
$Agents = Get-SCOMAgent | Where-Object { $_.PrimaryManagementServerName -eq 'Server1.kartik.com' }
$Agents | Sort-Object Name | ForEach-Object {
    Write-Host ""
    "Agent :: " + $_.Name
    "--Primary MS :: " + ($_.GetPrimaryManagementServer()).ComputerName
    $failoverServers = $_.GetFailoverManagementServers()
    foreach ($managementServer in $failoverServers) {
        "--Failover MS :: " + $managementServer.ComputerName
    }
}
Write-Host ""

23.png

Shut down Server1.kartik.com

Event generated in SCOM console for Server1.kartik.com

24.png

Event Log from Management Server SCOM2016.kartik.com

25.png

Client2.kartik.com successfully failed over to the other gateway server, Node2.kartik.com

Event log generated on Client2.kartik.com

26.png

Client2.kartik.com shows as healthy in the SCOM console

28.png

4. Gateway Agent ( workgroup ) failover

Workgroup computer: Client.kartik.com

Primary Gateway Management Server: Server1.kartik.com

Failover Gateway Management Server: Node2.kartik.com

27.PNG

Note: For a workgroup computer to fail over, the certificate used for client authentication must also be imported into the personal store of the failover Gateway Management Server.

Workgroup client reporting to the gateway Server1.kartik.com

31.png

Certificates imported into the personal store of both Gateway Servers, Server1.kartik.com and Node2.kartik.com

29.png
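
The certificate stores on both gateways can also be compared remotely. This is a minimal sketch, assuming PowerShell remoting is enabled on both Gateway Servers; it only lists what is in the local machine personal store and does not validate the certificates.

```powershell
# List the certificates in the personal store of both Gateway Servers.
# ASSUMPTION: PowerShell remoting (WinRM) is enabled on both machines.
Invoke-Command -ComputerName "Server1.kartik.com", "Node2.kartik.com" -ScriptBlock {
    Get-ChildItem -Path Cert:\LocalMachine\My |
        Select-Object Subject, Thumbprint, NotAfter
}
```

Each returned row carries a PSComputerName property, so it is easy to see which gateway a certificate came from and to spot one that is missing or expired.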

Powershell commands to verify Gateway Agent failover

30.png

Shut down Server1.kartik.com

32

Event logs generated from Management Server SCOM2016.kartik.com

33

Event log generated on the workgroup computer after a successful failover

34

 

 

 

 

Powershell Script to clear cache on SCOM Agents

 

# Text file with one grey agent name per line
$path = "C:\GreyAgents.txt"
$srvlist = Get-Content $path
$serviceName = "HealthService"

foreach ($srv in $srvlist)
{
    Write-Host "Greyagents : $srv"

    # Stop the agent, delete its cache folder, then start it again
    Invoke-Command -ComputerName $srv -ScriptBlock { Stop-Service -Name $using:serviceName }
    Invoke-Command -ComputerName $srv -ScriptBlock { Remove-Item -Path "C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State" -Recurse }
    Start-Sleep -Seconds 10
    Invoke-Command -ComputerName $srv -ScriptBlock { Start-Service -Name $using:serviceName }

    Write-Host "Cleared Cache Successfully"
}

 

 

Powershell Script to schedule Maintenance Mode in SCOM

 

# Text file with one short server name per line
$path = "C:\SCOMMaintenanceMode.txt"
$domain = "kartik.com"

$MyFile = Get-Content $path
$MyFile

# Look up the Windows Computer class once, outside the loop
$Class = Get-SCOMClass -Name "Microsoft.Windows.Computer"

foreach ($srv in $MyFile)
{
    Write-Host "ServerName : $srv"

    # 20-minute maintenance window starting now
    $startTime = [DateTime]::Now
    $endTime = $startTime.AddMinutes(20)

    # Build the FQDN, which is what the display name is matched against
    $srv += ".$domain"

    $Instance = Get-SCOMClassInstance -Class $Class | Where-Object { $_.DisplayName -eq $srv }
    Start-SCOMMaintenanceMode -Instance $Instance -Reason "PlannedOther" -EndTime $endTime -Comment "Scheduled SCOM Maintenance Window"
}

 

Powershell Script to recycle HealthService on all GreyAgents in SCOM

# Text file with one grey agent name per line
$path = "C:\GreyAgents.txt"
$srvlist = Get-Content $path
$serviceName = "HealthService"

foreach ($srv in $srvlist)
{
    Write-Host "Greyagents : $srv"

    # Stop the agent service, wait, then start it again
    Invoke-Command -ComputerName $srv -ScriptBlock { Stop-Service -Name $using:serviceName }
    Start-Sleep -Seconds 10
    Invoke-Command -ComputerName $srv -ScriptBlock { Start-Service -Name $using:serviceName }

    Write-Host "Health Service Restarted Successfully"
}

 

List out Grey agents in SCOM with Powershell

# Output files (the C:\Greyagents folder must already exist for the CSV)
$file = "C:\Greyagents.txt"
$startdate = Get-Date
$currentdate = $startdate.ToString('MM-dd-yyyy_HH-mm-ss')

# Get the System Center agent class
$agent = Get-SCOMClass | Where-Object { $_.Name -eq "microsoft.systemcenter.agent" }

# Get the grey agents
$objects = Get-SCOMMonitoringObject -Class $agent | Where-Object { $_.IsAvailable -eq $false }

foreach ($object in $objects)
{
    # Display the grey agent in the PS window
    Write-Host "Greyagents:$object"

    # If you want output to Notepad, execute this
    $object.DisplayName + "," + $object.HealthState | Out-File $file -Append
}

# If you want output to CSV, execute this (once, after the loop, so earlier rows are not overwritten)
$objects | Select-Object DisplayName, HealthState | Export-Csv -Path "C:\Greyagents\Greyagents_$currentdate.csv" -NoTypeInformation