Grid Efforts in Belle
Hideyuki Nakazawa
(National Central University, Taiwan),
Belle Collaboration, KEK
3/27/2007
Outline
• Belle experiment
• Computing system in Belle
• LCG at KEK and Belle VO status
• Introduction of SRB
• Summary
Belle Experiment
“B factory” experiment at KEK (Japan).
$e^+ e^- \to \Upsilon(4S) \to B\bar{B}$
KEKB Accelerator
• Asymmetric e+e- collider
• 3.5 GeV (e+) on 8 GeV (e-)
• 3 km circumference
• 22 mrad crossing angle
• Continuous injection
Belle Detector
• General purpose
• 7 sub-detectors
[Figure: KEKB site photo; labels: Mt. Tsukuba, KEKB (3 km), Linac, Belle]
Belle Collaboration
Aomori U.
BINP
Chiba U.
Chonnam Nat’l U.
U. of Cincinnati
Ewha Womans U.
Frankfurt U.
Gyeongsang Nat’l U.
U. of Hawaii
Hiroshima Tech.
IHEP, Beijing
IHEP, Moscow
IHEP, Vienna
ITEP
Kanagawa U.
KEK
Korea U.
Krakow Inst. of Nucl. Phys.
Kyoto U.
Kyungpook Nat’l U.
EPF Lausanne
Jozef Stefan Inst. / U. of Ljubljana / U. of Maribor
U. of Melbourne
Nagoya U.
Nara Women’s U.
National Central U.
Nat’l Kaohsiung Normal U.
National Taiwan U.
National United U.
Nihon Dental College
Niigata U.
Osaka U.
Osaka City U.
Panjab U.
Peking U.
U. of Pittsburgh
Princeton U.
Riken
Saga U.
USTC
Seoul National U.
Shinshu U.
Sungkyunkwan U.
U. of Sydney
Tata Institute
Toho U.
Tohoku U.
Tohoku Gakuin U.
U. of Tokyo
Tokyo Inst. of Tech.
Tokyo Metropolitan U.
Tokyo U. of Agri. and Tech.
Toyama Nat’l College
U. of Tsukuba
Utkal U.
VPI
Yonsei U.
13 countries, 57 institutes, ~400 collaborators
Many contributions from Taiwan
Luminosity
Produce a large number of B mesons!
$1\,\mathrm{fb}^{-1} \sim 10^{6}\ B\bar{B}$ pairs
[Figure: integrated luminosity (fb$^{-1}$) history, reaching 710 fb$^{-1}$]
• Peak luminosity: $1.7118 \times 10^{34}\,\mathrm{cm^{-2}\,s^{-1}}$
• 1 fb$^{-1}$ ~ 1 TB / day
• Crab cavity installed, being tuned now. Luminosity doubled?
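As a rough cross-check of the figures on this slide, the sketch below converts the quoted peak luminosity into an integrated luminosity per day and estimates the number of B-meson pairs in the full data set. The conversion 1 fb^-1 = 10^39 cm^-2 is the standard unit definition. Running at peak luminosity for a full 24 hours is an assumption made here, so the daily figure is an upper bound rather than a measured value.

```python
# Back-of-envelope check of the luminosity figures quoted on the slide.
PEAK_LUMINOSITY = 1.7118e34      # cm^-2 s^-1, from the slide
INTEGRATED_LUMINOSITY_FB = 710   # fb^-1, from the slide
BB_PAIRS_PER_FB = 1e6            # ~10^6 BB-bar pairs per fb^-1, from the slide

CM2_PER_INV_FB = 1e39            # 1 fb^-1 = 10^39 cm^-2 (unit definition)
SECONDS_PER_DAY = 86_400

# Assumption: the machine runs at peak luminosity for a full day (upper bound).
fb_per_day = PEAK_LUMINOSITY * SECONDS_PER_DAY / CM2_PER_INV_FB
total_bb_pairs = INTEGRATED_LUMINOSITY_FB * BB_PAIRS_PER_FB

print(f"Integrated luminosity per day at peak: {fb_per_day:.2f} fb^-1")
print(f"B-meson pairs in {INTEGRATED_LUMINOSITY_FB} fb^-1: {total_bb_pairs:.1e}")
```

At peak this gives somewhat more than 1 fb^-1 per day, which is consistent with the order-of-magnitude statement "1 fb^-1 ~ 1 TB / day" on the slide.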
History of Belle computing system
Performance                           1997 (4 years)   2001 (5 years)   2006 (6 years)
Computing Server [SPECint2000 rate]   ~100 (WS)        ~1,250 (WS+PC)   ~42,500 (PC)
Disk Capacity [TB]                    ~4               ~9               1000
Tape Library Capacity [TB]            160              620              3,500
Work Group Server [# of hosts]        3+(9)            11               80+16FS
User Workstation [# of hosts]         25WS +68X        23WS +100PC      128PC
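To make the scale of the upgrades easier to see, the following sketch computes growth factors between system generations from the table values. The dictionary keys and the `growth` helper are illustrative names introduced here, and the approximate ("~") figures are taken at face value.

```python
# Capacity figures from the table (approximate values taken at face value).
generations = {
    1997: {"cpu_specint2000": 100, "disk_tb": 4, "tape_tb": 160},
    2001: {"cpu_specint2000": 1_250, "disk_tb": 9, "tape_tb": 620},
    2006: {"cpu_specint2000": 42_500, "disk_tb": 1_000, "tape_tb": 3_500},
}


def growth(metric: str, start: int, end: int) -> float:
    """Return the ratio of a metric between two system generations."""
    return generations[end][metric] / generations[start][metric]


for metric in ("cpu_specint2000", "disk_tb", "tape_tb"):
    print(f"{metric}: x{growth(metric, 1997, 2006):.1f} from 1997 to 2006")
```

Under these numbers, CPU and disk capacity grew by factors of a few hundred between 1997 and 2006, while tape capacity grew by a factor of about 20.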
Overview of the B Computer
[Figure: B Computer system layout; labels: Workgroup Servers, Computing Servers, reserved for Grid, Storage, On-line Reconstruction Farm]
Belle System
Storage System (DISK): 1PB
Computing Server: ~42,500 SPECint2K
Storage System (HSM): 3.5PB
Data Production at Belle (@500/fb)
[Figure: data-flow diagram; recoverable labels listed below]
• online reconstruction farm; rawdata + “DST” data; HSM ~ 1PB
• production; 2 THz (to finish in 2 months)
• MC: Generation and Detector Simulation
• 2.5 THz (to finish in 6 months)
• Loose Selection Criteria; “MDST” data (four vector, PID info etc.); hadron 120TB + others; non-HSM
• Users' analyses
Why Grid in Belle?
• No urgent requirement
• Belle is shifting to precision and exotic measurements
• More MC statistics necessary for precise measurements
• New skims for exotic processes
• Lesson in de facto standard
Maybe we should start considering the Grid
Grid Introduction Strategy
• Strong support from KEK CRC
• Start with MC production, accumulate experience, and gradually shift to handling experimental data
• Recruitment
  • Some collaborators who are already running LCG are preparing to join the Belle VO
  • Experiencing the Grid's potential may change how Belle regards it?
LCG Deployment at KEK
JP-KEK-CRC-01
• Since Nov. 2005. Registered to GOC, in operation as WLCG
• Site Role:
  • practice for production system JP-KEK-CRC-02.
  • test use among university groups in Japan.
• Resource and Component:
  • SL-3.0.5 w/ gLite-3.0 later
  • CPU: 14, Storage: ~1.5TB
  • FTS, FTA, RB, MON, BDII, LFC, CE, SE
• Supporting VOs:
  • belle, apdg, g4med, ppj, dteam, ops and ail
JP-KEK-CRC-02
• Since early 2006. Registered to GOC, in operation as WLCG
• Site Role:
  • More stable services based on KEK-1 experiences.
• Resource and Component:
  • SL or SLC w/ gLite-3.0 later
  • CPU: 48, Storage: ~1TB (w/o HPSS)
  • Full components
• Supporting VOs:
  • belle, apdg, g4med, atlasj, ppj, ilc, dteam, ops and ail
Operation is supported by the great efforts of APROC members in ASGC.
Belle VO
• 9 sites
• Belle software is installed at 3 sites (KEK x2, ASGC)
• ~60 CPUs
• 2TB storage
• MC production ongoing
• Installation manual ready
• GFAL with Belle software
Total Number of Jobs at KEK in 2006
[Figure: job counts for JP-KEK-CRC-01 and JP-KEK-CRC-02, with the Belle VO marked in each panel]
Total CPU Time at KEK in 2006
(Normalized by 1kSI2K)
[Figure: CPU time in hrs kSI2K for JP-KEK-CRC-01 and JP-KEK-CRC-02, with the Belle VO marked in each panel]
SRB-DSI
Logical Site Overview
[Figure: network diagram; recoverable labels listed below]
• Zones and links: KEK-DMZ, KEK Firewall, SuperSINET, Grid LAN, KEK-CC
• Nodes: KEK-1, KEK-2, WS, SRB, MCAT, HSM, Local files, CPUs
• Networks: 130.87.104.0/22, 130.87.208.0/22, 130.87.224.0/21, 202.13.197.0/24, 172.22.28.0/24
• $ scp output Belle:
• $ scp input Grid:
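The address blocks on this slide can be examined with Python's standard `ipaddress` module. This is only a sketch: the slide extraction does not preserve which block belongs to which labelled zone, so the networks are left unlabelled, and the host address used in the lookup is a hypothetical example rather than a real KEK machine.

```python
import ipaddress

# Address blocks as printed on the slide; the mapping from block to
# labelled zone is not recoverable, so they are kept unlabelled here.
SLIDE_NETWORKS = [
    "130.87.104.0/22",
    "130.87.208.0/22",
    "130.87.224.0/21",
    "202.13.197.0/24",
    "172.22.28.0/24",
]

networks = [ipaddress.ip_network(cidr) for cidr in SLIDE_NETWORKS]


def find_network(host: str):
    """Return the slide network containing `host`, or None if there is none."""
    address = ipaddress.ip_address(host)
    for network in networks:
        if address in network:
            return network
    return None


for network in networks:
    kind = "private" if network.is_private else "public"
    print(f"{network}: {network.num_addresses} addresses, {kind}")

# Hypothetical host address, used only to exercise the lookup.
print(find_network("130.87.225.10"))
```

Only 172.22.28.0/24 lies in private (RFC 1918) address space, which suggests it is an internal segment; the other blocks are publicly routable.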
SRB Introduction Schedule
[Figure: schedule chart over March (W1 to W5) and April (W1 to W4); bar positions are not recoverable]
• Construction Planning
  • Grid
  • Belle Operation
  • Networking
  • KEKCC/IBM
• Preparation: MCAT, SRB, FW, SRB-DSI
• Test
• Connection
• Start Operation
Belle Grid Deployment Future Plan
• Federate with Japanese universities.
  • KEK hosts the Belle experiment and behaves as Tier-0.
  • Univ. with reasonable resources: full LCG (Tier-1)
  • Univ. without resources: UI
• The central services such as VOMS, LFC and FTS are provided by KEK.
• KEK also covers web information and support service.
• Grid operation is carried out in cooperation with 1~2 staff members at each full LCG site.
[Figure: preliminary design; labels: Tier-0 (JP-KEK-CRC-02, JP-KEK-CRC-03, "deploy in the future"), Tier-1 universities, UI-only universities]
Summary
• Belle VO launched
• Belle software is installed at 3 sites
• KEK sites are mainly used by Belle
• MC production ongoing
• SRB is being introduced
Additional (Belle's) Resources
We now have a high-performance computer system, but we did not suddenly switch to the “less expensive” system. We have been testing such systems for several years.
● Linux based PC clusters
● S-ATA disk based RAID drives
● S-AIT tape drives
Additional resources: 20units/20TB, 350TB disks, 1.5PB tapes, 934 CPUs
B computer, for comparison: 1000TB disks, 3.5PB tapes, 2280 CPUs
These resources have been essential for Belle (production/analysis)
Belle Grid Deployment Plan
We are planning a two-phase deployment for the Belle experiment.
Phase-1: Belle users use the VO in JP-KEK-CRC-02, sharing it with other VOs.
• JP-KEK-CRC-02 consists of the “Central Computing System” maintained by IBM corporation.
• Available resources: CPU: 72 processors (opteron), SE: 200TB (with HPSS)
Phase-2: Deployment of JP-KEK-CRC-03 as the Belle Production System
• JP-KEK-CRC-03 uses a part of the “B Factory Computer System” resources.
• Available resources (maximum estimation): CPU: 2200 CPU, SE: 1PB (disk), 3.5 PB (HSM)
• This system will be maintained by CRC and NetOne corporation.
Computing Servers
● Dell PowerEdge 1855: Xeon 3.6GHz x2, Memory 1GB
● Made in Taiwan [Quanta]
● WG: 80 servers (for login), Linux (RHEL)
● CS: 1128 servers, Linux (CentOS)
● total: 45662 SPEC CINT 2000 Rate, equivalent to 8.7THz
● 1 enclosure = 10 nodes / 7U space; 1 rack = 50 nodes
● CPU will be increased by x2.5 (i.e. to 110000 SPEC CINT 2000 Rate) in 2009.
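The "8.7THz" and "110000" figures can be reproduced approximately from the server counts. The sketch below assumes that the THz figure is simply the sum of clock speeds over all CPUs in both server classes, and that the WG servers carry the same dual 3.6 GHz Xeon configuration as the CS servers; neither assumption is stated on the slide.

```python
# Figures from the slide.
WG_SERVERS = 80              # work group (login) servers
CS_SERVERS = 1128            # computing servers
CPUS_PER_SERVER = 2          # "Xeon 3.6GHz x2"
CLOCK_GHZ = 3.6
TOTAL_SPECINT_RATE = 45_662  # SPEC CINT 2000 Rate
PLANNED_FACTOR = 2.5         # planned increase by 2009

# Assumption: the THz figure is the plain sum of clock speeds over all
# CPUs, and both server classes use the same CPU configuration.
total_cpus = (WG_SERVERS + CS_SERVERS) * CPUS_PER_SERVER
aggregate_thz = total_cpus * CLOCK_GHZ / 1000
planned_rate = TOTAL_SPECINT_RATE * PLANNED_FACTOR

print(f"CPUs: {total_cpus}")
print(f"Aggregate clock: {aggregate_thz:.1f} THz")
print(f"Planned SPEC CINT 2000 Rate in 2009: {planned_rate:.0f}")
```

Under these assumptions the aggregate clock comes out at about 8.7 THz, matching the slide, and the x2.5 upgrade gives roughly 114000, which the slide rounds to 110000.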
Storage System (Disk)
● Total 1PB with 42 file servers (1.5PB in 2009)
● SATAII 500GB disk x ~2000 (~1.8 failure/day ?)
● 3 types of RAID (to avoid problems)
  ● SystemWorks MASTER RAID B1230: 16drive/3U/8TB (made in Taiwan)
  ● ADTX ArrayMasStor LP: 15drive/3U/7.5TB
  ● Nexsan SATABeast: 42drive/4U/21TB
● HSM = 370 TB, non-HSM = 630 TB
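The disk figures on this slide are mutually consistent, as the following sketch checks. It uses decimal units (1 TB = 1000 GB), treats "~2000" as exactly 2000 disks, and takes the "~1.8 failure/day" figure literally even though the slide marks it with a question mark; the variable names are illustrative.

```python
# Figures from the slide (decimal units: 1 TB = 1000 GB).
DISK_SIZE_GB = 500
DISK_COUNT = 2000            # "x ~2000", treated as exact here
HSM_TB = 370
NON_HSM_TB = 630
FAILURES_PER_DAY = 1.8       # marked with "?" on the slide

# RAID unit types as (drives per unit, capacity in TB).
RAID_UNITS = {
    "MASTER RAID B1230": (16, 8.0),
    "ArrayMasStor LP": (15, 7.5),
    "SATABeast": (42, 21.0),
}

raw_capacity_tb = DISK_SIZE_GB * DISK_COUNT / 1000
partition_total_tb = HSM_TB + NON_HSM_TB

# Capacity per drive implied by each RAID unit, in GB.
gb_per_drive = {name: tb * 1000 / drives for name, (drives, tb) in RAID_UNITS.items()}

# Fraction of disks failing per year if the daily figure held all year.
annual_failure_fraction = FAILURES_PER_DAY * 365 / DISK_COUNT

print(f"Raw capacity: {raw_capacity_tb:.0f} TB, HSM + non-HSM: {partition_total_tb} TB")
print(f"GB per drive by RAID type: {gb_per_drive}")
print(f"Implied annual failure fraction: {annual_failure_fraction:.0%}")
```

All three RAID types work out to 500 GB per drive, and both the raw capacity and the HSM plus non-HSM split total 1 PB. Taken literally, 1.8 failures per day would mean roughly a third of the disks failing each year, which may be why the slide flags that number as uncertain.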
Storage System (Tape)
● HSM: PetaSite (SONY)
  ● 3.5PB + 60drv + 13srv
  ● SAIT 500GB/volume
  ● 30MB/s drive
  ● Petaserve
● Backup
  ● 90TB + 12drv + 3srv
  ● LTO3 400GB/volume
  ● NetVault
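For a sense of scale, the sketch below derives the number of tape volumes and the aggregate drive bandwidth of the HSM from the figures above. It assumes decimal units and that all 60 drives stream concurrently at the quoted 30 MB/s with no mount or seek overhead, so the full-read time is a lower bound rather than an operational estimate.

```python
# Figures from the slide (decimal units: 1 PB = 1,000,000 GB).
HSM_CAPACITY_GB = 3_500_000   # 3.5 PB
VOLUME_GB = 500               # SAIT volume size
DRIVES = 60
DRIVE_MB_PER_S = 30

SECONDS_PER_DAY = 86_400

volumes = HSM_CAPACITY_GB // VOLUME_GB
aggregate_mb_per_s = DRIVES * DRIVE_MB_PER_S

# Assumption: all drives stream at full speed with no mount/seek overhead.
full_read_days = HSM_CAPACITY_GB * 1000 / aggregate_mb_per_s / SECONDS_PER_DAY

print(f"Tape volumes needed: {volumes}")
print(f"Aggregate drive bandwidth: {aggregate_mb_per_s / 1000:.1f} GB/s")
print(f"Minimum time to read the full HSM: {full_read_days:.1f} days")
```

Under these assumptions the library holds about 7000 volumes, and a full pass over the archive would take at least three weeks.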