15.1 – INTRODUCTION TO PHYSICAL-QUERY-PLAN OPERATORS
PRESENTED BY: JASON CHEE

QUERY PROCESSOR
• Query Processor: the group of DBMS components that turns user queries and data-modification commands into a sequence of database operations and executes those operations

QUERY COMPILATION (CH 16)
• Three parts:
  • Parsing: construct a parse tree
  • Query rewrite: parse tree -> initial algebraic expression -> improved logical query plan (expected to run faster)
  • Physical plan generation: converts the logical query plan to a physical query plan by selecting appropriate algorithms and an order of execution

PHYSICAL-QUERY-PLAN OPERATORS
• Physical operators are often implementations of relational algebra operators
• Examples of non-relational operators:
  • Scan: bring into memory each tuple of some relation
  • Iterators: the method by which the operators comprising a physical query plan pass requests for tuples, and the answers, among themselves

SCANNING TABLES
• Reading the contents of a relation R
• Table-scan:
  • Relation R is stored in secondary memory
  • The blocks containing tuples of R are known, and it is possible to get the blocks one by one
• Index-scan:
  • If there is an index on any attribute of R, we may be able to use this index to get all the tuples of R

SORTING WHILE SCANNING TABLES
• We may want to sort a relation as we read its tuples, for several reasons. Examples:
  • An ORDER BY clause
  • Operations that require their input relations to be sorted
• The physical-query-plan operator sort-scan can be implemented in many ways. One example: if R has a B-tree index on the sort attribute a, scanning the index yields the tuples in sorted order.

COMPUTATIONAL MODEL FOR PHYSICAL OPERATORS
• A query is made of several relational algebra operations, and its query plan is composed of several physical operators
• Estimate cost by the number of disk I/Os
• To compare algorithms, we assume that the arguments of any operator are found on disk, but the result of the operator is left in main memory
  • Because the size of the result does not depend on the algorithm
  • The final write of the result is a cost of the query, not of the algorithm

PARAMETERS FOR MEASURING COSTS
• M: number of main-memory buffers available to the operator, each the size of one disk block. Could be smaller than total main memory if several operators share memory.
• B or B(R): size of relation R, measured as the number of blocks needed to hold all tuples of R
• T or T(R): number of tuples in R
  • T/B = tuples per block
• V(R, [a1, a2, …, an]): number of distinct values in a column, or number of distinct combinations of values when several attributes are listed

I/O COST FOR SCAN OPERATORS
• Table-scan:
  • If R is clustered, need B disk I/Os
  • If R is not clustered, could need up to T disk I/Os: in the worst case one block read per tuple
• Index-scan:
  • If the column data the query needs is contained in the index
    • SELECT category_id FROM tbl WHERE category_id BETWEEN 10 AND 100;
    • Don't need to access the table
    • Cost is often smaller than B

ITERATORS FOR IMPLEMENTATION OF PHYSICAL OPERATORS
• Design pattern to implement physical operators
• Three methods:
  1) Open(): Initializes data structures
  2) GetNext(): Returns the next tuple in the result and adjusts data structures as necessary.
     • If there are no more tuples, returns "not found"
  3) Close(): Ends the iteration after all tuples have been obtained. Calls Close() on any arguments of the operator.

TABLE-SCAN ITERATOR METHODS

THANK YOU
• Please feel free to ask any questions.
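
SUPPLEMENT: TABLE-SCAN ITERATOR SKETCH
As an illustration of the Open() / GetNext() / Close() pattern and the B disk I/Os of a clustered table-scan, here is a minimal Python sketch. It is not taken from the slides: the class name TableScan, the helper scan_all, the io_count counter, and the representation of R as an in-memory list of blocks are all assumptions made for illustration, and a "disk I/O" is only simulated by counting block fetches.

```python
from typing import List, Optional

Row = tuple          # one tuple of relation R (hypothetical representation)
Block = List[Row]    # one disk block holds several tuples

NOT_FOUND = None     # sentinel returned by get_next() when no tuples remain


class TableScan:
    """Table-scan iterator over a clustered relation stored as a list of blocks."""

    def __init__(self, blocks: List[Block]) -> None:
        self.blocks = blocks
        self.io_count = 0            # simulated disk I/Os (block reads)
        self._block_no = 0           # index of the next block to fetch
        self._buffer: Block = []     # the one block currently in memory
        self._pos = 0                # position of the next tuple in the buffer
        self._is_open = False

    def open(self) -> None:
        """Open(): initialize data structures; no tuples are read yet."""
        self.io_count = 0
        self._block_no = 0
        self._buffer = []
        self._pos = 0
        self._is_open = True

    def get_next(self) -> Optional[Row]:
        """GetNext(): return the next tuple, or NOT_FOUND if none remain."""
        if not self._is_open:
            raise RuntimeError("open() must be called before get_next()")
        # Fetch blocks until the buffer has an unread tuple (skips empty blocks).
        while self._pos >= len(self._buffer):
            if self._block_no >= len(self.blocks):
                return NOT_FOUND
            self._buffer = self.blocks[self._block_no]
            self._block_no += 1
            self.io_count += 1       # one simulated disk I/O per block
            self._pos = 0
        row = self._buffer[self._pos]
        self._pos += 1
        return row

    def close(self) -> None:
        """Close(): end the iteration and release the buffer."""
        self._buffer = []
        self._is_open = False


def scan_all(blocks: List[Block]):
    """Drive the iterator to completion; return (tuples, simulated I/O count)."""
    scan = TableScan(blocks)
    scan.open()
    rows = []
    row = scan.get_next()
    while row is not NOT_FOUND:
        rows.append(row)
        row = scan.get_next()
    scan.close()
    return rows, scan.io_count


if __name__ == "__main__":
    # R with T = 5 tuples packed into B = 3 blocks.
    r_blocks = [[(1, "a"), (2, "b")], [(3, "c"), (4, "d")], [(5, "e")]]
    tuples, ios = scan_all(r_blocks)
    print(len(tuples), "tuples read with", ios, "simulated disk I/Os")
```

In this sketch each block is fetched exactly once, so the counter ends at B, the cost stated for a clustered table-scan. A table-scan has no argument operators, so close() has nothing further to call; an operator with inputs would call close() on each of them.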