Conditions Database Oracle implementation
Andrea Valassi (CERN IT-DB)
Borrowing a lot from previous presentations by Emil Pilecki
LCG Conditions Database Workshop, 08-Dec-2003

Overview
• Oracle implementation developed by Emil Pilecki in IT-DB
  – Work started in early 2002 (initial interest mainly in LHCb)
  – Production release 0.4.1.6 in August 2002
• Implementation is essentially frozen at its August 2002 status
  – Emil left the group in late 2002
  – No production users before the Harp migration
• Harp condition data migrated from Objy in November 2003
  – Only minor ad-hoc changes (by A.V.) with respect to Emil’s 0.4.1.6 version
  – Actual data migration using tools derived from Emil’s migration tools
• Emil’s 0.4.1.6 version ready to be re-released for LCG
  – Ported to SCRAM and LCG CVS repository

Implementation choices
• Oracle 9i server
  – At CERN: devdb9 (development) or pdb01 (production)
• Relational data model
  – Oracle 9i object features not used
• Client access through the OCCI library
  – More user-friendly and better suited for C++ than OCI
  – OCCI implementation transparent for users
• Some performance optimization for read access (queries)
  – Data insertion not optimized yet

Why relational data model?
• Data model is simpler
• Sufficient for condition data
• Well known and reliable
• Less storage overhead
• Less client-side processing

Relational Design (ERD): folder(set)s, objects, data

Legend
  # Attribute is a part of primary key
  * Attribute cannot be null
  o Null value allowed for this attribute
  r Attribute is a foreign key
  u Attribute is a part of Unique constraint
  Line styles: possible data relation; necessary data relation; one to many relation; foreign key is a part of primary key for that table

Folder_set
  # folder_set_id
  * name
  o description
  o attributes
  r parent_set_id

Folder
  # folder_id
  * name
  o description
  o attributes
  r parent_set_id

Condition_object
  # object_id
  * since
  * till
  * insertion_time
  * layer
  o description
  r data_id
  r folder_id

Condition_data
  # data_id
  o data_value

Relational Design (ERD): tags

Tag
  # tag_id
  u name
  * creation_time
  o description

Object_tag
  * assignment_time
  #r tag_id
  #r object_id

Folder_tag
  * assignment_time
  #r tag_id
  #r folder_id

Folder_set, Folder and Condition_object have the same attributes as on the previous slide.

Use of materialized views
• Materialized views for data that is frequently accessed:
  – Full folder and folder set paths
    • Built from hierarchical queries
  – Current HEAD of each folder
    • To simplify computation of overlapping intervals on inserting
    • To speed up read access to the HEAD intervals
• Limitations
  – Update operations are auto-committed
  – Rollback and bulk updates not possible

Folder_paths
  * full_path
  * folder_id

Folder_sets_paths
  * full_path
  * folder_set_id
  * parent_fs_id

Heads
  * object_id
  * since
  * till
  * insertion_time
  * layer
  * parent_folder_id
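
As an illustration of what the path views hold, the following Python sketch rebuilds full folder set paths from the parent_set_id links, which is the result a hierarchical query would produce on the server. It is not the actual implementation: the sample rows, the empty name for the root folder set and the "/" separator are assumptions made for this example, and the helper names full_path and build_paths are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical Folder_set rows: folder_set_id -> (name, parent_set_id).
# The root folder set is assumed to have an empty name and no parent.
FOLDER_SETS: Dict[int, Tuple[str, Optional[int]]] = {
    1: ("", None),
    2: ("harp", 1),
    3: ("tpc", 2),
}


def full_path(folder_set_id: int,
              folder_sets: Dict[int, Tuple[str, Optional[int]]]) -> str:
    """Walk the parent links up to the root and join the names with '/'."""
    parts = []
    seen = set()
    current: Optional[int] = folder_set_id
    while current is not None:
        if current in seen:
            # Guard against corrupted data with a cycle in the hierarchy.
            raise ValueError("cycle in folder set hierarchy")
        seen.add(current)
        name, parent = folder_sets[current]
        parts.append(name)
        current = parent
    # The root alone yields an empty string, reported here as "/".
    return "/".join(reversed(parts)) or "/"


def build_paths(folder_sets: Dict[int, Tuple[str, Optional[int]]]) -> Dict[int, str]:
    """Compute the (folder_set_id -> full_path) content of a path view."""
    return {fs_id: full_path(fs_id, folder_sets) for fs_id in folder_sets}
```

Precomputing this mapping once, as a materialized view does, avoids repeating the walk up the hierarchy on every lookup by path.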

Partitioning
• Partitioning by folder
  – The object and data tables have a separate partition for each folder
    • These tables are also hash-subpartitioned by object id and data id
• Advantages
  – Performance enhancements for large databases
• Limitations
  – The partitioning schema is hardcoded and the same for all folders
    • Too many partitions for simple folders with few rows
    • Large storage overhead (each partition is a segment of at least 64 KB)
  – No partitioning by time range yet

Other implementation features
• User-defined indexes for main columns
  – Explicitly used in queries through optimizer hints
• Stored PL/SQL procedures
  – Increase server-side processing and reduce network traffic
  – Most obvious example: computation of overlapping intervals on insertion

Comments from Harp migration
• A few ad-hoc changes in the implementation
  – Specify separate data, index, BLOB tablespaces
  – Use tables and packages from the schema of a different user
• Objy to Oracle migration via export/import to/from files
  – Standard export/import tools using the API work but are not optimal
  – Modified import tool (breaking the API) used for Harp migration
    • “Clone” mode: keep the same insertion date and layer number
    • Bulk updates to increase insert speed by a factor of 10 (to 600 rows/sec)
• Some of the items on the to-do list
  – Reengineer data insertion (and m.view usage) to use bulk updates
  – Reengineer data retrieval for BLOBs to use bulk reads
  – Keep track of which objects are system-inserted

Platforms
• Linux RedHat 7.3
  – Compiler: gcc 2.95.2 and 2.96
  – The OCCI libraries are not yet released for gcc 3.2
• Sun Solaris 5.7 and 5.8
  – Compiler: CC Sun WorkShop 6 C++ 5.2
• Windows 2000
  – Microsoft Visual Studio 6.0
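
The computation of overlapping intervals on insertion, cited under "Other implementation features" as the most obvious use of stored PL/SQL procedures, can be illustrated with a short Python sketch. It is not the PL/SQL code of the implementation. It assumes half-open validity intervals [since, till), integer times, and that a newly inserted object replaces older ones in the HEAD over its whole validity range; the function name insert_into_head is hypothetical.

```python
from typing import List, Tuple

# (since, till, object_id) with a half-open validity interval [since, till).
Interval = Tuple[int, int, int]


def insert_into_head(head: List[Interval], new: Interval) -> List[Interval]:
    """Return the HEAD after inserting `new`, assuming the newest object wins.

    `head` is a list of non-overlapping intervals. Older intervals that
    overlap the new one are trimmed or split around it.
    """
    new_since, new_till, _ = new
    if new_since >= new_till:
        raise ValueError("since must be smaller than till")

    result: List[Interval] = []
    for since, till, object_id in head:
        if till <= new_since or since >= new_till:
            # No overlap: the old interval stays in the HEAD unchanged.
            result.append((since, till, object_id))
            continue
        if since < new_since:
            # Part of the old interval remains visible before the new one.
            result.append((since, new_since, object_id))
        if till > new_till:
            # Part of the old interval remains visible after the new one.
            result.append((new_till, till, object_id))
    result.append(new)
    result.sort()  # keep the HEAD ordered by validity start
    return result
```

For example, inserting (5, 15, 3) into a HEAD holding (0, 10, 1) and (10, 20, 2) trims both older intervals, while inserting (2, 4, 9) into a HEAD holding only (0, 10, 1) splits the older interval in two. Running this logic on the server, as the stored procedures do, avoids shipping the overlapping rows to the client.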