
Dollar Universe 6: Basic Health Check and Troubleshooting
Dollar Universe Health Check
Dollar Universe Node Status
Dollar Universe services up or not
From GUI: IO server must be up:
If IO down:
From Dollar Universe CMD
[root@hkvtlce599 du]# . ./TST600/bin/unicheck
DUAS check procedure
Command : uxlst fnc fnc=* exp

FNC COMPANY NODE                      AREA STATUS  START AT        STOP AT ACTIVE AT       PID   CYCLE
--- ------- ------------------------- ---- ------- --------------- ------- --------------- ----- -----
IO  TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934                         4945  0
IO  TST600  hkvtlce599.orsypgroup.com A    Started 09/17/2014 0934                         5114  0
IO  TST600  hkvtlce599.orsypgroup.com I    Started 09/17/2014 0934                         5167  0
IO  TST600  hkvtlce599.orsypgroup.com S    Started 09/17/2014 0934                         5301  0
CDJ TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934                         5579  0
CAL TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934         11/04/2014 1520 4945  0
LAN TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934         11/04/2014 1520 4945  0
EXC TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934         12/31/2036 2359 4945  120
SUR TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934         12/31/2036 2359 4945  0
BVS TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934                         5655  0
ALM TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934         12/31/2036 2359 4945  0
DQM TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934                         5020  0
EEP TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934                         5465  0
SYN TST600  hkvtlce599.orsypgroup.com X    Started 09/17/2014 0934         11/04/2014 1515 4945  0
SAP TST600  hkvtlce599.orsypgroup.com      Stopped                                         0     0
OAP TST600  hkvtlce599.orsypgroup.com      Stopped                                         0     0
GSI TST600  hkvtlce599.orsypgroup.com      Started 09/17/2014 0934                         5546  0
[root@hkvtlce599 du]#
From Windows:
From Unix
[root@hkvtlce599 du]# ps -ef | grep TST600
univa     4945     1  0 Sep17 ?        07:49:51 ./uxioserv TST600 X hkvtlce599.orsypgroup.com
univa     5020     1  0 Sep17 ?        00:01:10 ./uxdqmsrv TST600 X hkvtlce599.orsypgroup.com
univa     5114     1  0 Sep17 ?        00:04:11 ./uxioserv TST600 A hkvtlce599.orsypgroup.com
univa     5167     1  0 Sep17 ?        00:03:42 ./uxioserv TST600 I hkvtlce599.orsypgroup.com
univa     5168     1  0 Sep17 ?        00:00:00 ./uxcdjsrv TST600 A hkvtlce599.orsypgroup.com
univa     5279     1  0 Sep17 ?        00:00:00 ./uxbvssrv TST600 A hkvtlce599.orsypgroup.com
univa     5301     1  0 Sep17 ?        00:04:01 ./uxioserv TST600 S hkvtlce599.orsypgroup.com
univa     5315     1  0 Sep17 ?        00:00:00 ./uxcdjsrv TST600 I hkvtlce599.orsypgroup.com
univa     5393     1  0 Sep17 ?        00:00:00 ./uxbvssrv TST600 I hkvtlce599.orsypgroup.com
univa     5413     1  0 Sep17 ?        00:00:00 ./uxcdjsrv TST600 S hkvtlce599.orsypgroup.com
univa     5465     1  0 Sep17 ?        00:00:23 ./uxeepsrv start TST600 X hkvtlce599.orsypgroup.com
univa     5533     1  0 Sep17 ?        00:00:00 ./uxbvssrv TST600 S hkvtlce599.orsypgroup.com
univa     5546     1  0 Sep17 ?        00:00:13 ./uxgsisrv start TST600 X hkvtlce599.orsypgroup.com
univa     5579     1  0 Sep17 ?        00:00:33 ./uxcdjsrv TST600 X hkvtlce599.orsypgroup.com
univa     5655     1  0 Sep17 ?        00:00:09 ./uxbvssrv TST600 X hkvtlce599.orsypgroup.com
root     17346 16893  0 15:14 pts/1    00:00:00 grep TST600
[root@hkvtlce599 du]#
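The process list from ps can also be checked by a script instead of by eye. The sketch below is only an illustration: the function name missing_duas_procs is hypothetical, and it assumes that uxioserv, uxcdjsrv and uxbvssrv are expected in all four areas (X, A, I, S), as in the listing for this node. Other nodes may legitimately run fewer areas.

```shell
# Hypothetical helper (not part of Dollar Universe).
# Reads `ps -ef` output on stdin and prints every "<binary> <area>" pair
# for which no process line is found.
# Assumption: all four areas X, A, I, S are expected on this node.
# Usage: ps -ef | missing_duas_procs TST600
missing_duas_procs() {
    company=$1
    ps_out=$(cat)
    for bin in uxioserv uxcdjsrv uxbvssrv; do
        for area in X A I S; do
            # A live process shows up as e.g. "./uxioserv TST600 X <node>"
            if ! printf '%s\n' "$ps_out" | grep -q "$bin $company $area "; then
                echo "$bin $area"
            fi
        done
    done
}
```

No output means every expected process was found. Any printed pair is a candidate for a closer look with unicheck.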
Communication with Management Server OK or not?
[root@hkvtlce599 du]# ./TST600/bin/unims -checkms
UVMS on hkvtlce599.orsypgroup.com/4184 reachable, version 6.2.01
[root@hkvtlce599 du]#
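For a scripted check, the line printed by unims -checkms can be tested for the word "reachable". This is a minimal sketch: the function name is hypothetical, and it assumes the success line always has the form shown above. What the command prints on failure is not shown here, so anything else is simply treated as "not reachable".

```shell
# Hypothetical helper (not part of Dollar Universe).
# $1 is the text printed by `unims -checkms`.
# Prints the reported UVMS version and returns 0 if the text matches the
# "reachable" line shown above; returns 1 for any other text.
# Usage: uvms_version_if_reachable "$(./TST600/bin/unims -checkms)"
uvms_version_if_reachable() {
    case $1 in
        *" reachable, version "*)
            # Keep only what follows "reachable, version "
            printf '%s\n' "${1##* reachable, version }"
            ;;
        *)
            return 1
            ;;
    esac
}
```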
Does Dollar Universe receive updates from UVMS?
For logins:
[root@hkvtlce599 du]# ./TST600/bin/lstproxy
Proxy file: 28 definitions. Sync from hkvtlce599_MgtServer at: 10/31/2014 10:44:57
SYS [DOMAIN\]USER          HOSTNAME                   GROUP
--- ---------------------- -------------------------- --------------
UVC univa                  hkvtlce599.orsypgroup.com  Administrators
UVC oper                   hkvtlce599.orsypgroup.com  Operators
SYS orsyp\cau              HKLPMSUP04.orsypgroup.com  Operators
SYS orsyptst\dynadmin      hkstm2k804                 Operators
UVC admin                  hkvtlce599.orsypgroup.com  Administrators
UVC univa                  hkvtlce599.orsypgroup.com  PROFADM
SYS hkvtw2k303\administr   *                          PROFADM
SYS NT AUTHORITY\SYSTEM    *                          PROFADM
UVC tst560d                hkvtlce599.orsypgroup.com  PROFDEV
For alert:
[root@hkvtlce599 du]# ./TST600/bin/lstalert
Alert file: 1 definition. Sync from hkvtlce599_MgtServer at: 09/17/2014 09:36:26
(C)NAME        TYPE MON   AREA MU TASK SESSION UPROC
-------------- ---- ----- ---- -- ---- ------- --------
(M)SCN alert   UPR  STA A X    *  *    *       SCN_TEST
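Both lstproxy and lstalert print a header line ending in "Sync from <server> at: <date>". To see quickly when the node last received an update from UVMS, that line can be extracted. The function name last_sync below is hypothetical, and the sketch assumes the header wording stays exactly as in the two outputs above.

```shell
# Hypothetical helper (not part of Dollar Universe).
# Reads lstproxy or lstalert output on stdin and prints
# "<sync source> <date> <time>" taken from the first "Sync from" line.
# Usage: ./TST600/bin/lstproxy | last_sync
last_sync() {
    sed -n 's/^.*Sync from \(.*\) at: \(.*\)$/\1 \2/p' | head -n 1
}
```

An old date here, compared with the time of the last change made on UVMS, suggests the update never reached the node.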
Can you do Monitoring? (Access to Job Run)
Check the CDJ.
The CDJ is in charge of Job Monitoring (Job Run).
If everything else works but Monitoring does not: restart the CDJ.
Jobs not launching (all new jobs stay in Launch Wait)
Check:
- The launcher must be started.
- The launcher wakeup date must be neither in the past nor in the far future (the wakeup date is the date of the next job start).
If the launcher is stopped or its wakeup date is wrong: restart the launcher.
Note: if the license has expired, the launcher and calculator won’t start.
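The wakeup date rule can be expressed as a small check. This is a sketch under stated assumptions: the function name classify_wakeup is hypothetical, the date is assumed to be in the "MM/DD/YYYY HHMM" form that unicheck prints, and what counts as "far future" is a site decision, so the limit is passed in as a parameter rather than fixed.

```shell
# Hypothetical helper (not part of Dollar Universe). Arguments:
#   $1  wakeup date in the form "MM/DD/YYYY HHMM"
#   $2  current time as YYYYMMDDHHMM
#   $3  latest acceptable wakeup as YYYYMMDDHHMM (site-specific assumption)
# Prints "past", "far-future" or "ok"; prints "invalid" and returns 1
# if $1 does not have the expected form.
classify_wakeup() {
    # Reorder MM/DD/YYYY HHMM into YYYYMMDDHHMM so that numeric order
    # equals chronological order.
    stamp=$(printf '%s\n' "$1" |
        sed 's#^\([0-9][0-9]\)/\([0-9][0-9]\)/\([0-9]\{4\}\) \([0-9]\{4\}\)$#\3\1\2\4#')
    case $stamp in
        *[!0-9]*|'') echo invalid; return 1 ;;
    esac
    if [ "$stamp" -lt "$2" ]; then
        echo past
    elif [ "$stamp" -gt "$3" ]; then
        echo far-future
    else
        echo ok
    fi
}
```

For example, with the current time given as 201411041514 and a limit one day later (201411051514), the date 11/04/2014 1520 is classified as ok and 12/31/2036 2359 as far-future.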
No new jobs (not even in Launch Wait)
Check the calculator status (it should be Started).
Note: if the license has expired, the launcher and calculator won’t start.
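The calculator (CAL) and launcher (LAN) status can be read from saved unicheck output. The sketch below is illustrative only: engine_status is a hypothetical name, and it assumes each row starts with the FNC code and contains the word Started or Stopped, as in the unicheck output earlier in this document.

```shell
# Hypothetical helper (not part of Dollar Universe).
# Reads unicheck output on stdin and prints the status ("Started" or
# "Stopped") of the first row whose first column equals $1.
# Usage: engine_status CAL < unicheck_output.txt
engine_status() {
    awk -v fnc="$1" '
        $1 == fnc {
            # The status column position can vary, so scan the fields.
            for (i = 2; i <= NF; i++)
                if ($i == "Started" || $i == "Stopped") { print $i; exit }
        }'
}
```

Empty output means no row for that FNC code was found in the input.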
New executions stay in Pending
99% of the time, the Pending status lasts only 1 or 2 seconds.
If it lasts longer:
- Normal if there are too many jobs in the queue.
- Not normal if there is a queue issue.
Check the queues.
Job stays in Event Wait although the condition should be satisfied
Check the History Trace.
Get the relevant line(s).
Check if you can find this line in the Job Events of the SAME machine.
Do you have an event with exactly the same properties?
- No: the Event Wait is normal (but the real issue may be why there is no event).
- Yes: the Event Wait is not normal.
If this is a critical job and it must run now, a possible workaround is to bypass the condition.
Another workaround is to create the event manually.
Cross-node execution: the destination node did not receive an event or execution order
Check the files:
- Data/exp/u_fecd60.dta
- Data/exp/u_fecl60.dta
They should be empty. If they are not, there is or was a communication issue, and you have to check on both sides.
Note: a solution or workaround might be to reset those files.
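A quick way to look at the two files is a size test. This sketch makes two assumptions that should be verified on a real node: that "empty" means zero bytes (it would give false alarms if the files always carry a header), and that Data/exp sits under a directory passed as the first argument. The function name is hypothetical.

```shell
# Hypothetical helper (not part of Dollar Universe).
# $1 is the directory that contains Data/exp (an assumption; adjust it
# to the local installation).
# Prints each of the two files that exists and is larger than zero bytes.
# Assumption: "empty" means a size of zero bytes.
nonempty_exchange_files() {
    for f in "$1/Data/exp/u_fecd60.dta" "$1/Data/exp/u_fecl60.dta"; do
        if [ -s "$f" ]; then
            echo "$f"
        fi
    done
}
```

Run it on both the source and the destination node, since the communication issue has to be checked on both sides.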
What to ask for if a user has a job execution issue
- Job Log (if it exists)
- History Trace
- The script
If you need to check whether a user did something
Check the audit trail.
Job Submission (Submission Account)
Submission issue?
- No Job Log
- You have a History Trace
Unix: the real user is defined in the Submission Account.
The system user must exist.
Windows: User Services
The Windows service must exist and be started.
On Windows there is one more submission option: Interactive Jobs.
For these jobs, the user’s Windows session must be started.