Overview of Archival and Purge Process in IBM Sterling B2B Integrator

Bhavya M Reddy, Staff Software Engineer, IBM Sterling B2B Integrator L2 Support

Table Of Contents
Introduction to Sterling B2B Integrator
Importance of Database Maintenance
Business Process and Document lifespan Configuration
Lifespan calculation for the Business Process and the Document
Business Processes Involved in Archival and Purge
Archival and Purge Process Flow depicted pictorially
Life Cycle of a BP and a Document in Sterling B2B Integrator
PurgeAll BP
Related Links

Introduction to Sterling B2B Integrator

Sterling B2B Integrator (SBI) is a transaction engine that runs the processes you define and helps you manage them. It ties together applications, processes, data, and people, both within and outside your organization, and so helps in integrating businesses.

The Sterling B2B Integrator approach to integration centers around business process management. A business process is a goal-driven, ordered flow of activities that accomplishes a business objective. Using Sterling B2B Integrator, you integrate the activities that make up your company's business processes.
Common examples of such activities include:
- XML, EDI, and proprietary file translation, transformation, and filtering
- Human interaction through a browser interface (such as reviewing and approving data)
- Content-based routing of messages
- Data publishing
- Extended process models that integrate the execution of a B2B protocol, such as AS2, with enterprise system integration, such as invoking the SAP adapter

Importance of Database Maintenance

Every activity here involves a Business Process (BP), a Business Document, or a BP associated with a Business Document. These BPs and Documents have to be stored for future use. SBI maintains this data in the database or the file system, as per the Document Storage settings.

During the course of business activity in SBI, data builds up in the database or the file system (if the document storage is File System). It is very important to clear this data at scheduled intervals to keep the database healthy. SBI has well-defined clean up processes, listed below, which help in clearing the data from the database:
- Schedule_IndexBusinessProcessService
- Schedule_BackupService
- Schedule_PurgeService
- Schedule_AssociateBPsToDocs
- Schedule_BPRecovery
- Schedule_AutoTerminateService
- Schedule_BPLinkagePurgeService

Business Process and Document lifespan

The time period for which a BP and a Document are available in SBI is determined by their lifespan. The lifespan can be configured in the SBI dashboard User Interface (UI) under Operations → Archive Manager → Archive Configurations → Configure Archive settings.

In the example configuration, the lifespan of the BP/Workflow is 2 days and the lifespan of the document (trackable business process information) associated with the BP is 1 day 12 hours. The same screen determines whether the BP and the associated document are archived and then purged, or purged directly.
If the “Archived” option is chosen, the Backup Directory field, which sets the location where the documents are archived, has to be filled in.

When a BP is configured to use “System Default”, the BP takes the lifespan set in the Archive Manager. Consider the MapTest BP as an example. To find the lifespan chosen for it, open the BP in the source manager, click edit, and step through to the lifespan page.

If the lifespan has to be different for a particular BP, choose the “Process Specific” option and set the lifespan accordingly. This overrides the default settings in the Archive Manager for this BP. In the example configuration, the MapTest BP will be available in the system for 2 days. The same screen also offers the archival option: archive first and then purge, or purge directly.

Lifespan calculation for the BP and the Document

SBI provides an option to enable Document Tracking, for which the trackable BP lifespan is set on the Archive Manager page. This option keeps the document available for an extended period of time.

Document Tracking can be enabled at a global level for all BPs by setting tracking.global.enabled=true in doc_tracking.properties. It can also be enabled for a specific BP by editing the BP and enabling document tracking there.

Now let us consider two scenarios for the lifespan calculation, using the example configuration (BP lifespan of 2 days, trackable business process information lifespan of 1 day 12 hours):

1. If Document Tracking is enabled:
   lifespan of the BP = 2 days
   lifespan of the Document = lifespan of the BP + lifespan of the Trackable Business Process Information = 2 days + 1 day 12 hours = 3 days and 12 hours

2. If Document Tracking is not enabled:
   lifespan of the BP = 2 days
   lifespan of the Document = lifespan of the BP = 2 days
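The two scenarios above can be written out as a small calculation. This is only an illustrative sketch: the function name document_lifespan and its parameters are hypothetical and are not part of SBI; the values are the ones from the example configuration.

```python
from datetime import timedelta


def document_lifespan(bp_lifespan, trackable_lifespan, tracking_enabled):
    """Return the document lifespan for the two scenarios described in the text.

    Hypothetical helper for illustration only; SBI computes this internally.
    """
    if tracking_enabled:
        # Scenario 1: BP lifespan plus trackable business process information lifespan
        return bp_lifespan + trackable_lifespan
    # Scenario 2: the document lives exactly as long as the BP
    return bp_lifespan


bp = timedelta(days=2)
trackable = timedelta(days=1, hours=12)

print(document_lifespan(bp, trackable, tracking_enabled=True))   # 3 days, 12:00:00
print(document_lifespan(bp, trackable, tracking_enabled=False))  # 2 days, 0:00:00
```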
Business Processes involved in clean up activity

Schedule_IndexBusinessProcessService (BP moving service)
- The BP name is ‘Schedule_IndexBusinessProcessService’
- Runs every 10 minutes by default
- Index works only on BPs that are in Completed or Terminated state, that is, on BPs with ARCHIVE_FLAG=-1
- Calculates the lifespan and Removal Method for the BP based on the Workflow Definition
- The Removal Method is either 0 (archive/backup) or 1 or 2 (purge)
- Updates the records in ARCHIVE_INFO, setting the ARCHIVE_DATE according to the lifespan and the ARCHIVE_FLAG to the Removal Method from WF_INST_S
- For messages added to a mailbox using FTP/SFTP or a mailbox add service, Index updates the ARCHIVE_INFO records with a 10-year lifespan, which is reset following a mailbox delete
- If a BP fails Index, it is re-marked with an ARCHIVE_FLAG of -5

Schedule_BackupService (Archive)
- Runs every morning at 2:00 AM by default
- ‘Schedule_BackupService’ archives data that has an ARCHIVE_FLAG of 0
- Uses the records in WF_INST_S to calculate the data eligible for backup
- Once the data is archived, the ARCHIVE_FLAG is changed from 0 to 1 or 2 to indicate the BP is now ready for purge
- The new ARCHIVE_FLAG value is determined by the purge settings in the BP or in the Archive Manager in the UI

Schedule_PurgeService
- By default, the Schedule_PurgeService BP runs every 10 minutes
- Deletes expired BP and Document data from the various tables
- Deletes documents from disk
- Uses the ARCHIVE_INFO table, looking for BPs that have an ARCHIVE_FLAG of either 1 or 2 and an ARCHIVE_DATE earlier than the current system date
- Deletes all eligible BP records, in batches
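The way the three services above move a record through the ARCHIVE_FLAG values can be sketched as a toy model. This is a simplified illustration under stated assumptions, not SBI code: the ArchiveInfo class, the index, backup and purge functions, the default post-archive flag of 2, and the dates are all hypothetical, and real purging works in batches across many tables.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

PENDING_INDEX = -1    # completed or terminated BP, not yet indexed
ARCHIVE = 0           # removal method: archive first
PURGE_FLAGS = (1, 2)  # records the purge service may delete


@dataclass
class ArchiveInfo:
    """Hypothetical stand-in for one ARCHIVE_INFO row."""
    wf_id: int
    archive_flag: int = PENDING_INDEX
    archive_date: Optional[datetime] = None


def index(rec, removal_method, expires_at):
    # Index only touches records still flagged -1
    if rec.archive_flag == PENDING_INDEX:
        rec.archive_flag = removal_method
        rec.archive_date = expires_at


def backup(rec, flag_after_archive=2):
    # Backup only touches records flagged 0; the resulting flag is an assumption
    if rec.archive_flag == ARCHIVE:
        rec.archive_flag = flag_after_archive


def purge(records, now):
    # Keep every record that is not both purge-flagged and expired
    return [
        r for r in records
        if not (r.archive_flag in PURGE_FLAGS
                and r.archive_date is not None
                and r.archive_date < now)
    ]


rec = ArchiveInfo(wf_id=61618)
index(rec, removal_method=ARCHIVE, expires_at=datetime(2024, 1, 2))
backup(rec)
remaining = purge([rec], now=datetime(2024, 1, 3))
print(rec.archive_flag, len(remaining))  # 2 0
```

Running purge with a date before the ARCHIVE_DATE leaves the record in place, which mirrors the rule that only expired records are removed.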
Schedule_AssociateBPsToDocs
- Very important housekeeping process
- Looks through the DOCUMENT, DOCUMENT_LIFESPAN and WORKFLOW_CONTEXT tables for eligible records whose workflow_id is 0 or -1, and updates their BP ID to its own workflow_id (associates them with itself)
- For example, if Schedule_AssociateBPsToDocs runs with a BP ID of ‘12345’, then the Documents with a BP ID of ‘0’ or ‘-1’ have their BP ID updated to ‘12345’
- This process flags records in the following tables: DOCUMENT, DOCUMENT_LIFESPAN, DOCUMENT_EXTENSION, TRANS_DATA, CORRELATION_SET

Schedule_BPRecovery

If an SBI instance fails abnormally (the JVM crashes or is killed via hardstop.sh), the WorkFlowEngine (WFE) does not get an opportunity to synchronize the database. Business Processes that are in an ACTIVE, HALTING or WAITING_ON_IO state therefore remain that way indefinitely (these are referred to as Active Hung processes), and the UI does not offer any actions to repair them, since operating on an in-flight BP is not safe.

BPRecover attempts to synchronize the database and the WFE without impacting any newly executing BPs. The BPReportService obtains the list of BPs in ACTIVE, HALTING or WAITING_ON_IO state from the database. This set is then compared to the list of threads, the messages in the queues, and the ActivityData entries (objects in memory that can be associated with an in-flight process). The comparison is done 3 times with a 10-second sleep interval. If a candidate makes it through all 3 checks, it is considered active hung.

The BP recovery level can be set individually for a BP in the Business Process manager.

Schedule_AutoTerminateService

The Auto Terminate service is pre-configured and, by default, is scheduled to run each day at 4:00 A.M. The service checks for business processes that have been in a specified state for a specified length of time and then terminates them. By default, the Auto Terminate service checks for and terminates business processes that have been in a halted state for over 14 days.
You can adjust these settings to suit your specific business needs.

Overriding the bprecovery.properties File Settings

The number of days a business process must be in a specified state before being terminated by the Auto Terminate service, and the specified state or states, are defined by properties in the bprecovery.properties file. The default settings are specified by the following lines:

    auto_terminate_days=14
    num_states=1
    auto_terminate_state1=halted
    auto_terminate_batch=1000

The default settings can be overridden using the customer_overrides.properties file. You can change the number of days before termination, change the specified state, or add additional states.

The value of auto_terminate_days in the bprecovery.properties file can also be overridden using BPML in your business process, with a statement in the following format:

    <assign to="AUTO_TERM_DAYS">new_value</assign>

Schedule_BPLinkagePurgeService
- Cleans the workflow_linkage table
- The workflow_linkage table contains parent-child BP information
- Runs once every night
- Needs to run more frequently on loaded systems
- Adjust the max BP value if needed:

    <assign to="max_business_processes">180000</assign>

Archival and Purge Process Flow depicted pictorially

Message enters SBI → BP processes the message → BP after completion gains ARCHIVE_FLAG=-1

The clean up processes (Schedule_IndexBusinessProcessService, Schedule_BackupService and Schedule_PurgeService) act on the completed BP and remove it from SBI based on the lifespan set for it. The Schedule_IndexBusinessProcessService BP is responsible for changing the ARCHIVE_FLAG from -1 to 0 or 1, and the Schedule_BackupService BP is responsible for changing the ARCHIVE_FLAG from 0 to 2. The Schedule_PurgeService BP looks for records with ARCHIVE_FLAG 1 and 2 and purges them.

Flow chart:
1. A completed BP has ARCHIVE_FLAG=-1.
2. The Index BP runs and sets the ARCHIVE_FLAG to 0 or 1.
3. The Backup and Purge BPs run. What happens next depends on the ARCHIVE_FLAG value:
   - If ARCHIVE_FLAG=0: the Backup BP archives the record (the data is backed up to the folder provided in the Archive Manager), sets the ARCHIVE_FLAG to 2 and makes the record eligible for purge.
   - If ARCHIVE_FLAG=1 or ARCHIVE_FLAG=2: the Purge BP acts on the record and purges it.

Life Cycle of a BP and a Document in SBI

Based on the BP and Document lifespan set in the Archive Manager configuration, the BP and the document go through different stages within SBI. Consider the example below, in which the default document storage is Database, the lifespan of the BP is 1 day, the lifespan of the Trackable BP is 1 day, and Document Tracking is disabled.

Figure: BP manager configuration (lifespan)

Let us now check the process of archival for the BP StockQuoteBP, which has completed execution.

Figure: Screenshot of the BP execution

    SELECT * FROM WORKFLOW_CONTEXT WHERE WORKFLOW_ID='61618'

This BP is also associated with a message:

    SELECT * FROM DOCUMENT WHERE WORKFLOW_ID='61618'

Every BP gets ARCHIVE_FLAG=-1 after completing execution:

    SELECT * FROM ARCHIVE_INFO WHERE WF_ID='61618'

The BP is now eligible to be indexed. When the Schedule_IndexBusinessProcessService BP runs, it picks up this Workflow ID for indexing and makes it eligible for archival by setting ARCHIVE_FLAG=0. It also sets an ARCHIVE_DATE.

    SELECT * FROM ARCHIVE_INFO WHERE WF_ID='61618'

The BP is now eligible for archival. When the Schedule_BackupService BP runs, it archives this workflow and the message associated with it. It then sets the ARCHIVE_FLAG to 2.

    SELECT * FROM ARCHIVE_INFO WHERE WF_ID='61618'

Note: The archived documents are saved in the directory specified in the Archive Manager and stay there until they are removed manually. SBI plays no role in removing these documents from the file system.

The BP and the message are now archived and are therefore eligible to be purged.
As the BP execution screenshot shows, the BP was executed on 26-06-2015. The BP and document lifespan is 1 day and document tracking is not enabled, so BP lifespan = document lifespan = 1 day. Therefore the ARCHIVE_DATE is set to 27-07-2012, which means that when the Schedule_PurgeService BP runs on the date specified, it purges this BP and the associated document from the system.

After the Purge service BP completes its execution on the specified ARCHIVE_DATE, the BP and document have been removed from the system. The queries below return no results, which confirms this.

    SELECT * FROM WORKFLOW_CONTEXT WHERE WORKFLOW_ID='61618'
    SELECT * FROM ARCHIVE_INFO WHERE WF_ID='61618'
    SELECT * FROM DOCUMENT WHERE WORKFLOW_ID='61618'

Searching for the BP ID in the UI also returns no results. This confirms the document has been completely removed from the system.

When the Document Storage is set to File System, the process above remains the same. But for the Documents and Workflows to be physically removed (purged) from the file system, the parameters below have to be set in the archive_thread.properties file:

    GENERATE_PURGE_DOCDISK_LIST=true
    PURGE_DOCS_ON_DISK=true
    PURGE_DOCDISK_LIST_FILENAME=/SBI_install_directory/documents/purge_dod_list.txt

When a document is created, a file is put in the documents directory. If the process using the document is archived, a copy of the file goes to the arc_data folder. When purge runs, it creates a purge_dod_list of the files to be deleted from the documents directory. If PURGE_DOCS_ON_DISK is set to true, purge consumes that list and deletes the files from the documents directory. The files in the arc_data directory are not deleted, though. They have to be removed manually.

Purge All Business Process
- Used to purge all the eligible records in the system
- The PurgeAll BP contains two flags: Purge (set to ALL) and Max Loops (set to 100 by default). The Max Loops value can be changed.
- The scheduled Purge BP and the PurgeAll BP cannot be run at the same time.

CAUTION: The Purge All business process should not be used for ordinary production purposes. It is only for use, generally on the advice of IBM Support, to immediately remove data from the live system, regardless of its expiration date. This may be advisable, for example, if the scheduled Purge business process has encountered a failure that caused a backlog of purge-eligible data. An additional flag (MAX_LOOPS) limits the number of loops made by the Purge All business process, which helps control how much data the system handles in a single execution. If a large amount of data has accumulated, this limit helps the system continue with other processing.

There is also a PurgeAll script in the bin directory. Running this script truncates all the transaction-related data. It is therefore not recommended to run it in the production system.

CAUTION: Before running this command, even on UAT or DEV, customers are requested to open a PMR with IBM Support.

Related Links
- IBM Sterling B2B Integrator Documentation Home Page: http://www-01.ibm.com/support/knowledgecenter/SS3JSW/welcome
- IBM Sterling B2B Integrator – Understanding and Monitoring DB growth: http://www.ibm.com/support/docview.wss?uid=swg27044160&aid=1