Midwest Management Summit
Bulletproofing PullDP's
Kim Oppalfens & Todd Hemsell
#MMSMinnesota
Standard DP Characteristics
• Single Instance Storage
• Each file crosses the WAN once
• Can create a DP Group with multiple packages and assign a new server to it, and each file crosses the WAN once
• Schedule and Rate Limits
Single Instance Storage in Action
• With R2, logging was set to a lower level
• Enable Debug and Verbose logging to see behavior
Sending thread starting for Job: 10013, package: CEN0000F, Version: 1, Priority: 2, server: DP1.DEMO.PVT, DPPriority: 200
Performing preactions package CEN0000F, Distribution point DP1.DEMO.PVT
Sending content Content_602d1b97-5e8e-4211-ae82-9061c6271678.1 for package CEN0000F
Redistribute=0, Related=
Checking for existence of scite-3.1.0x64.msi in Content_602d1b97-5e8e-4211-ae82-9061c6271678.1.
Checked for existence of scite-3.1.0x64.msi in Content_602d1b97-5e8e-4211-ae82-9061c6271678.1. (1)
Not copying: scite-3.1.0x64.msi
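The "Checking for existence" / "Not copying" lines above can be sketched as a hash lookup before each copy. This is a minimal model, not the PkgXferMgr implementation: the function name plan_transfer, the use of SHA-256, and the in-memory dp_store set are all assumptions made for illustration.

```python
import hashlib

def plan_transfer(package_files, dp_store):
    """Decide which files of one package must be sent to a DP.

    package_files: dict of file name -> file content (bytes).
    dp_store: set of content hashes already on the DP (hypothetical model
              of the content library, not the real on-disk layout).
    Returns (to_send, skipped) as two lists of file names.
    """
    to_send, skipped = [], []
    for name, data in package_files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in dp_store:
            skipped.append(name)   # corresponds to "Not copying: <file>"
        else:
            dp_store.add(digest)   # the file crosses the WAN once
            to_send.append(name)
    return to_send, skipped

dp_store = set()
pkg_a = {"scite-3.1.0x64.msi": b"installer-bytes", "readme.txt": b"a"}
pkg_b = {"scite-3.1.0x64.msi": b"installer-bytes", "notes.txt": b"b"}

first = plan_transfer(pkg_a, dp_store)
second = plan_transfer(pkg_b, dp_store)
print(first)   # both files are sent
print(second)  # the shared .msi is skipped
```

The second package only sends notes.txt, because the installer it shares with the first package is already in the store.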
Get the code: Single Instance Storage Bandwidth Savings
-- Sums the size of every file that appears in more than one package
SELECT SUM(TotalFileSize_KB) AS [DuplicateDataInKB] -- / 1024 for MB, / 1024 again for GB
FROM (
    SELECT CF.FileName,
           SUM(CF.FileSize) / 1024 AS [TotalFileSize_KB],
           CF.StorageHash,
           COUNT(DISTINCT PkgID) AS [PackageCount]
    FROM CI_ContentFiles CF
    JOIN CI_ContentPackages CP ON CF.Content_ID = CP.Content_ID
    GROUP BY CF.FileName, CF.FileSize, CF.StorageHash
    HAVING COUNT(DISTINCT PkgID) > 1
) AS FileData
Standard DP Settings
• DistMgr processes 3 packages at a time
• PkgXferMgr limit is the maximum number of packages multiplied by the maximum threads per package
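The PkgXferMgr limit is simple arithmetic. In the sketch below, 3 packages comes from the slide; the value of 5 threads per package is an assumed example, so check your own site's setting.

```python
def pkgxfermgr_thread_limit(max_packages, threads_per_package):
    """Upper bound on concurrent PkgXferMgr sending threads."""
    return max_packages * threads_per_package

# 3 packages is from the slide; 5 threads per package is an assumed example.
limit = pkgxfermgr_thread_limit(3, 5)
print(limit)  # 15
```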
Standard DP Behavior (Desired)
• DistMgr processes package and hands it off to PkgXferMgr
• DistMgr is still “Processing” the package
• PkgXferMgr copies package and returns success to DistMgr
• DistMgr moves to next package
Standard DP Behavior (Actual)
• DistMgr processes package and hands it off to PkgXferMgr
• DistMgr is still “Processing” the package
• PkgXferMgr copies package to available DPs
• PkgXferMgr retries offline DPs for 50 hours
• PkgXferMgr uses all threads
• *DistMgr waits for completion before moving to the next package*
• All package processing halts until retries are exhausted
Microsoft’s Explanation (Kerim Hanif)
Program Manager in the Configuration Manager team responsible for content distribution.
• Full explanation:
http://blogs.technet.com/b/configmgrteam/archive/2013/06/06/introducing-the-pull-distribution-points.aspx
PullDP Stated Characteristics
• New feature with SP1
• According to Kerim:
How a PullDP Works
• DistMgr creates a snapshot and calculates the HASH
• PkgXferMgr sends a package info bundle to the PullDP
• PullDP opens the XML and gets a list of content from the DPLocation DPUrl
• PullDP component on the DP checks to see how many of the files are already downloaded
• PullDP component passes the list of files to DataTransferService
• DTS creates BITS jobs to download the content
• CCMEXEC gets the files from BITS download location and writes them to disk
• SMSDPProv imports the content into the content library
• PullDP creates status message and sends it to the Site Server
Get the code: PkgXferMgr Sends a package info bundle to the PullDP
SELECT [dbo].[fnGetPullDPXMLNotification]
(
'CEN0001B',--PackageID
1,-- PackageVersion
'PULLDP.DEMO.PVT',--DestSite
2,--Priority
'add',--Action
1,--AllowFallback
'O:SYG:BAD:P(A;;FA;;;BA)(A;OICIIO;GA;;;BA)(A;;0x1200a9;;;BU)(A;OICIIO;GXGR;;;BU)(A;;FA;;;BA)(A;OICIIO;GA;;;BA)', --SDDL
0,--ExpandShare
32780,--HashAlg
'3E0A0163665DA39C08FE71018370B008E625681C36B1F79564A105CA54D528D7', --HASH
''--Related
) AS Notification
PkgXferMgr Sends a package info bundle to the PullDP
Then begins to monitor progress
PullDPs didn't quite work out the way I expected
• Distribution Manager Available Threads (fixed)
• Downloads the same file multiple times
• Large packages and timeout Values
• Failed packages are already transferred
• Refresh failed package deletes the package
• Shared files get deleted/marked unavailable (Fixed after SP1)
• Faled to get write lock
• PullDP deletes itself on error
Downloads the same file multiple times
• Scenario / Steps to reproduce the issue
• Create a distribution point group
• Assign content to it with multiple identical files
• Assign to a PullDP
• What Happens
• PkgXferMgr sends multiple XML files to PullDP at the same time
• PullDP processes all XML files at the same time
• None of the files are downloaded yet
• PullDP adds all files to list of files to be downloaded
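The sequence above is a check-then-act race, which can be sketched as follows. This is an illustrative model only: process_manifests, the manifest lists, and the file names are hypothetical, and the real PullDP component is not written this way.

```python
def process_manifests(manifests, concurrent):
    """Build the download queue for several package manifests.

    manifests: list of lists of file names (one list per package XML).
    concurrent: True models every XML being checked before any download
                has finished; False models one XML at a time.
    """
    on_disk = set()
    queue = []
    if concurrent:
        # All manifests see the same empty snapshot of the disk,
        # so a shared file is queued once per package.
        snapshot = set(on_disk)
        for files in manifests:
            queue.extend(f for f in files if f not in snapshot)
    else:
        # Each file is recorded before the next manifest is checked.
        for files in manifests:
            for f in files:
                if f not in on_disk:
                    queue.append(f)
                    on_disk.add(f)
    return queue

manifests = [
    ["shared.cab", "a.msi"],
    ["shared.cab", "b.msi"],
    ["shared.cab", "c.msi"],
]
racy = process_manifests(manifests, concurrent=True)
serial = process_manifests(manifests, concurrent=False)
print(racy.count("shared.cab"), serial.count("shared.cab"))
```

In the concurrent case shared.cab is queued three times, once per package; processed one at a time it is queued once.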
Large packages and timeout Values
Failed packages are already transferred
• Scenario / Steps to reproduce the issue
1. Assign a large package to a PullDP over a slow link or unreliable network
2. PkgXferMgr checks status of package every 5 minutes 100 times
3. Package download has not completed
4. PkgXferMgr marks transfer as failed
5. PullDP continues to download package and sends success status message
6. DistMgr ignores success status message
7. “Refresh” the content on the DP
8. Return to step 2
• What Happens
• Package never makes it to the DP and network stays utilized in infinite loop
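The polling window in step 2 works out to 5 minutes x 100 checks = 500 minutes, a little over 8 hours. The sketch below does that arithmetic and, as a rough illustration, checks whether a package of a given size fits in the window. The package size, link speeds, and the assumption of a perfectly steady link are made up for the example.

```python
from datetime import timedelta

def polling_window(interval_minutes, attempts):
    """Total time PkgXferMgr keeps polling before marking the transfer failed."""
    return timedelta(minutes=interval_minutes * attempts)

def completes_in_time(package_mb, link_mbit_per_s, interval_minutes=5, attempts=100):
    """True if the download would finish inside the polling window.

    Assumes a steady link with no protocol overhead, which is optimistic.
    """
    seconds_needed = package_mb * 8 / link_mbit_per_s
    return seconds_needed <= polling_window(interval_minutes, attempts).total_seconds()

window = polling_window(5, 100)
print(window)                        # 8:20:00
print(completes_in_time(10_000, 4))  # 10 GB over 4 Mbit/s fits in the window
print(completes_in_time(10_000, 2))  # 10 GB over 2 Mbit/s does not
```

Raising either the interval or the number of attempts widens the window, which is what the query interval and timeout changes later in the deck are for.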
Refresh failed packages
Shared files get deleted/marked unavailable (Fixed after SP1)
• Scenario / Steps to reproduce the issue
• Package status is not installed due to content validation or failed transfer
• “Refresh” the package on the DP
• PullDP marks each file as unavailable in WMI / Deletes the file on PullDP
• What Happens
• Every package that shares that file is now broken
• Download errors
• Hash errors
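Why one refresh breaks other packages can be shown with a small model of shared files. Everything here is hypothetical (the refresh function, the manifests dictionary, the hash names); it only contrasts removing every file of the refreshed package with removing just the files no other package references.

```python
def refresh(package, manifests, files_on_dp, reference_aware):
    """Model a "Refresh" of one package and return the set of files removed.

    manifests: dict of package ID -> set of file hashes it needs.
    files_on_dp: set of file hashes currently on the PullDP (mutated in place).
    reference_aware: if True, files used by another package are left alone.
    """
    removed = set()
    for file_hash in manifests[package]:
        shared = any(
            file_hash in files
            for other, files in manifests.items()
            if other != package
        )
        if reference_aware and shared:
            continue
        removed.add(file_hash)
    files_on_dp.difference_update(removed)
    return removed

def broken_packages(manifests, files_on_dp, refreshed):
    """Packages other than the refreshed one that are now missing a file."""
    return sorted(
        pkg for pkg, files in manifests.items()
        if pkg != refreshed and any(f not in files_on_dp for f in files)
    )

manifests = {"PKG1": {"hashA", "hashB"}, "PKG2": {"hashA", "hashC"}}

naive_dp = {"hashA", "hashB", "hashC"}
refresh("PKG1", manifests, naive_dp, reference_aware=False)
naive_broken = broken_packages(manifests, naive_dp, "PKG1")

safe_dp = {"hashA", "hashB", "hashC"}
refresh("PKG1", manifests, safe_dp, reference_aware=True)
safe_broken = broken_packages(manifests, safe_dp, "PKG1")

print(naive_broken, safe_broken)
```

With the naive refresh PKG2 loses hashA and is reported broken; with the reference-aware refresh it is untouched.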
Faled to get write lock
• Scenario / Steps to reproduce the issue
• Assign multiple content to PullDP or PullDP gets a backlog
• PullDP downloads multiple files at the same time
• CCMEXEC threads get exhausted
• CCMEXEC cannot write the files to disk
• What Happens
• It just gets worse and worse as PullDP continues to download content and CCMEXEC retries each file and fales to write it to disk
• *Typo is intentional*
PullDP deletes itself on error
• Scenario / Steps to reproduce the issue
• PullDP has an error parsing the XML or processing the content
• PullDP tries to delete the bad file
• It actually deletes all of the files in sms_dp$\sms\bin
• What Happens
• No more PullDP!
Issue Summary, end result
• Entire content distribution system stops
• Cannot transfer content that takes more than 8 hours to download
• Cannot refresh content without breaking other content (Fixed after SP1)
• DistMgr gets stuck waiting for content to get to ALL DPs before moving to next content
• Cannot even get a package processed on the primary because all DistMgr threads are “stuck” waiting for PkgXferMgr to finish
• Content that does get downloaded cannot write to disk
• Server will not acknowledge successful transfers
• PullDP ends up committing suicide and deleting itself
The Solution!
• Increase Distmgr thread limit
• Increase the Query interval and timeout values
• Change fnGetPullDPXMLNotification to always ADD, never refresh
• Create new setting “PullDP Number Of Active Jobs”
• Create script to replace deleted files or install R2 CU3
Increase Distmgr thread limit
• Only possible in the console after R2. Prior to R2, increase the value in the DB
-- Find the values
SELECT SD.SiteCode, SC.ComponentName, SCP.Name, SCP.Value3, SCP.ID,SCP.ComponentID FROM
SC_Component SC
JOIN SC_SiteDefinition SD ON SD.SiteNumber = SC.SiteNumber
JOIN SC_Component_Property SCP ON SCP.ComponentID = SC.ID
WHERE SC.ComponentName IN('SMS_LAN_SENDER','SMS_DISTRIBUTION_MANAGER') AND SD.SiteCode = 'CEN'
AND SCP.Name IN('Package Thread Limit','Thread Limit')
/*
UPDATE SC_Component_Property
SET Value3 = 50-- Can go over 50 but console will not open distmgr properties
Where ID = 72057594037928051-- Verify the ID
AND ComponentID = 72057594037927963 -- Verify component id
AND Name = 'Thread Limit'
UPDATE SC_Component_Property
SET Value3 = 25-- console limit of 999. Would not go over the number of DP's you have
Where ID = 72057594037928054-- Verify the ID
AND ComponentID = 72057594037927963-- Verify component id
AND Name = 'Package Thread Limit'
*/
Increase the Query interval and timeout values
• Only possible in the DB, PS, or script
-- Find the values
SELECT SD.SiteCode, SC.ComponentName, SCP.Name, SCP.Value3, SCP.ID,SCP.ComponentID FROM
SC_Component SC
JOIN SC_SiteDefinition SD ON SD.SiteNumber = SC.SiteNumber
JOIN SC_Component_Property SCP ON SCP.ComponentID = SC.ID
WHERE SC.ComponentName IN('SMS_LAN_SENDER','SMS_DISTRIBUTION_MANAGER') AND SD.SiteCode = 'CEN'
AND SCP.Name IN('PullDP Query Interval')
/*
UPDATE SC_Component_Property
SET Value3 = 20
Where ID = 72057594037928056-- Verify the ID
AND ComponentID = 72057594037927963 -- Verify component id
AND Name = 'PullDP Query Interval'
*/
Change fnGetPullDPXMLNotification to always ADD, never refresh
• Paste the following into the function right before the SELECT statement then execute
--This is a theoretical modification that will make the remote DP add content instead of forcing a refresh.
-- do not use this in the real world. ever. no matter what
Set @Action = 'Add'
Create new setting “PullDP Number Of Active Jobs”
• Run on the server with the SMSProvider
For ModifyOrAddCM12Property.vbs, update the line Property_SiteCode = "XYZ" to the site code of the site the PullDP is a part of, and also modify this part at the top: Class_ItemName = "SMS_DISTRIBUTION_MANAGER|PrimaryFQDN" (change PrimaryFQDN to the SMSProviderServer).
Then run the script with these arguments: ModifyOrAddCM12Property.vbs <smsproviderserver> <sitecode>
You will see this in the PkgXferMgr.log on the site server when it works:
PullDP capacity : 10
For the property change to take effect, the PullDPs need a policy retrieval cycle, so either wait, update machine policy manually on the PullDP side, or restart CCMExec.
For PullDPs that are already in state, however, the existing jobs need to be cleared, and for this you will need another script (ResetPullDPState.vbs).
Assuming the PullDP is in a DP Group for targeting, you could run the ResetPullDPState script, then remove the PullDP from the group and add it back to retarget the prestaged content. You can run this script locally on the PullDP or specify the computer name if you want to run it remotely.
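What a cap on active jobs buys you can be sketched with a semaphore. This is an assumption about the general technique, not a description of how the PullDP enforces its capacity: run_jobs, the job body, and the sleep standing in for a download are all hypothetical. The capacity of 10 mirrors the log line above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_jobs(job_count, capacity):
    """Run job_count dummy jobs, never more than `capacity` at once.

    Returns the highest number of jobs observed running at the same time.
    """
    gate = threading.BoundedSemaphore(capacity)
    lock = threading.Lock()
    active = 0
    peak = 0

    def job(_):
        nonlocal active, peak
        with gate:                   # wait here if `capacity` jobs are running
            with lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.001)        # stand-in for the actual download
            with lock:
                active -= 1

    with ThreadPoolExecutor(max_workers=job_count) as pool:
        list(pool.map(job, range(job_count)))
    return peak

peak = run_jobs(40, 10)
print(peak)  # never above 10
```

Forty jobs are submitted at once, but the gate keeps the number in flight at or below the capacity, so a backlog waits in line instead of exhausting worker threads.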
Create script to replace deleted files or install R2 CU3
• Upgrade to R2 CU3. CU3 contains the fix
• If that is not possible…
• Copy the SMS folder from a known good PullDP to a package source
• Delete all log files from bin and the Logs directory
• Copy attached script to package source
• Create a package from the source
• Create a task sequence with a single step to execute a script
• Check the box to disable 64 bit redirection
• Set the command line to C:\Windows\System32\Cscript.exe FixDP.vbs
• Reference the package with the PullDP files and script
*This fixed the issue but broke PullDP and DataTransferService logging on my systems*
The end result
• A highly decoupled content distribution system
• Content only crosses the WAN once
• Refreshing a package only adds the missing files