Visualizations of High-Dimensional Space

Abstract
• Spatial data analysis occurs at different levels
  – Pixel
  – Pixel-group
  – Band
• Pixel level: common attributes are
  – Location: coordinate plane or geolocation
  – Reflectance values: 0-255, encoded in 8 bits

Abstract 2
• Band operations defined as functionals
• Visualize process using 1D and 2D representors
  – Jewell diagrams
  – Augmented Himalayan chain diagrams

Organization of Presentation
• Massive data sets description
• Image band formats
• Band operations as functionals
• 1D and 2D representors
• Conclusions

RSI Datasets
• Massive: one Thematic Mapper image in time covers approximately a 180-km square area and consists of well over one billion pixels
• Data repositories will soon reach petabyte size

Scalability Issues
• Cardinality
  – Row (or database size) scalability
• Dimensionality
  – Column (or dimension) scalability

Scalability Issues 2
• Spatial data scalability issues are addressed by using functionals and by visualizing the process with Jewell diagrams and mountain chain diagrams

Image Band Formats
• Existing formats
  – BIL (band interleaved by line)
  – BIP (band interleaved by pixel)
  – BSQ (band sequential)
• New format
  – bSQ (bit sequential)

Image Data Organized by Bands
• First level of data organization is to group by bands
• This figure represents the mechanism used for separating image data into bands
[Figure: SCENE DATA separated into BAND 1 (blue), BAND 2 (green) and BAND 3 (red)]

Spatial Data Formats
Example image: 2 lines of 2 pixels, two bands (decimal value and 8-bit pattern)

           BAND-1             BAND-2
Pixel 1    254 (1111 1110)    37  (0010 0101)
Pixel 2    127 (0111 1111)    240 (1111 0000)
Pixel 3    14  (0000 1110)    200 (1100 1000)
Pixel 4    193 (1100 0001)    19  (0001 0011)

BSQ format (2 files)
  Band 1: 254 127 14 193
  Band 2: 37 240 200 19
BIL format (1 file)
  254 127 37 240 14 193 200 19
BIP format (1 file)
  254 37 127 240 14 200 193 19
bSQ format (16 files; file Bij holds bit j of band i, with j = 1 the most significant bit)

           B11 B12 B13 B14 B15 B16 B17 B18   B21 B22 B23 B24 B25 B26 B27 B28
Pixel 1     1   1   1   1   1   1   1   0     0   0   1   0   0   1   0   1
Pixel 2     0   1   1   1   1   1   1   1     1   1   1   1   0   0   0   0
Pixel 3     0   0   0   0   1   1   1   0     1   1   0   0   1   0   0   0
Pixel 4     1   1   0   0   0   0   0   1     0   0   0   1   0   0   1   1

BIP
(Band Interleaved by Pixel)
• Pixel-consecutive scheme
• Data stored in pixel-major order
[Figure: SCENE DATA in BAND 1, BAND 2 and BAND 3, read against an ANALOG TO DIGITAL SCALE of 0 to 4, becomes DIGITIZED AND FORMATTED DATA: 0 1 4 3 2 0 2 4 2 4 0 3 ....]

BIL (Band Interleaved by Line)
• Image scan line constitutes the organizing base
• Data stored in line-major order
[Figure: SCENE DATA in BAND 1, BAND 2 and BAND 3, read against an ANALOG TO DIGITAL SCALE of 0 to 4, becomes DIGITIZED AND FORMATTED DATA stored by scan line (line 1 band 1, line 1 band 2, line 1 band 3, line 2 band 1, ...); sample values shown: 0 3 2 4 ...., 1 2 4 0 ...., 4 0 1 3 ...., 1 0 0 ..]

BSQ (Band Sequential Format)
• Data stored in band-major order
• Widely used format
• Each image band appears consecutively in the data file
[Figure: SCENE DATA in BAND 1, BAND 2 and BAND 3, read against an ANALOG TO DIGITAL SCALE of 0 to 4, becomes DIGITIZED AND FORMATTED DATA stored band after band; sample values shown: 0 3 2 4 .. 1 0 0 .., 1 2 4 0 .. 2 4 3 .., 4 0 1 3 ..]

bSQ (Bit Sequential Format)
• Split each band into eight separate files, one for each bit position.
• Reasons for using bSQ format
  – Different bits contribute to the value differently.
  – bSQ format facilitates representation of a precision hierarchy (from 1 to 8 bit precision).
  – bSQ format facilitates creation of an efficient data structure, the P-tree, algebra and cube

BSQ and bSQ
• BSQ and bSQ are "tabular" formats.
  – BSQ consists of a separate table for each feature band.
  – bSQ consists of a separate table for each bit of each band.
• One can view it this way:
  – The data set is initially 1 relation or table, R(K1,..,Kk, A1, …, An), where K1,..,Kk are structure attributes and A1, …, An are feature attributes.
• Structure attributes of a 2-D image are the X,Y coordinates of the pixels (rows).
• Feature attributes are the bands, B, G, R, NIR, …
• BSQ: we separate each feature into a separate file and suppress the structure attributes altogether (assuming pixels are always arranged in raster order) (aka: decomposition storage model (DSM), Copeland et al, SIGMOD85, pp 268-279).
• bSQ: we separate each bit of each feature into a separate file (raster order assumption) (aka: bit transpose file (BTF) model, Wong et al, VLDB85, pp 448-457).

Band Operations as Functionals 1
• Reduced, raster-ordered RSI dataset of 16 pixels
• Each pixel has
  – two structural attributes, x and y
  – three feature attributes, R (red), B (blue), G (green)
  – derived attributes, RVI, NTV, and RLTV

RSI Band Functionals

x  y    R   G   B   Y       R-G+7    RVI   NTV   RLTV
0  0    2   6   1   2       2-6+7 =    3    30     1
0  1    3   6   1   2       3-6+7 =    4    38     2
0  2    2   5   1   1       2-5+7 =    4     6     1
0  3    2   5   1   1       2-5+7 =    4     6     1
1  0    3   1   2   0       3-1+7 =    9   270     2
1  1    2   1   2   0       2-1+7 =    8   262     2
1  2    7   1   0   0       7-1+7 =   13   590     3
1  3    7   1   0   0       7-1+7 =   13   590     3
2  0    1   5   0   1       1-5+7 =    3    30     1
2  1    0   6   0   2       0-6+7 =    1   110     2
2  2    0   7   0   2       0-7+7 =    0   166     2
2  3    0   7   0   2       0-7+7 =    0   166     2
3  0    1   7   0   2       1-7+7 =    1   110     2
3  1    2   7   0   1       2-7+7 =    2    86     2
3  2    1   6   0   2       1-6+7 =    2    54     2
3  3    3   5   0   1       3-5+7 =    5    14     1

S =     36    76    8     19
S/16 =  2.25  4.75  0.5   1.1875

Contours
• Definition: let $q \in R^k$. The set of all points $x \in R^n$ such that $f(x) = q$ is the preimage of $q$ under $f$ and is denoted $f^{-1}(q)$.
• Now let $[p,q] \subset R^k$. The set of all points $x \in R^n$ such that $f(x) \in [p,q]$ is the preimage of $[p,q]$ under $f$, or the contour of $[p,q]$ under $f$.

Contours Around a Given Pixel
• RVI (rough vegetative index) contours are
  – M RVIxy-1
  – H RVIxy-1
• Using contours, functional pruning can prune off non-neighboring pixels

1D and 2D Tuple Visualizations
• Consider the RSI dataset of Figure 4, X, under a function as follows: let $f: X \subseteq R^n \to R^k$ be any function, with R = reals.
• If k = 1, then we call it a functional
• If k = 2 or 3, then we call it a diagram, and its range can be viewed as a plot of points, as in the Jewell and Mountain Chain diagrams. These are related to Parallel Coordinates
• If k = n, then it is a vector field.
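The band functional and contour definitions above can be illustrated with a minimal Python sketch. It uses the R and G columns of the 16-pixel table and the functional RVI = R - G + 7 shown there. The function names `rvi`, `contour` and `rvi_neighbors`, and the tolerance `eps`, are hypothetical names chosen for this illustration and do not come from the slides.

```python
# R and G columns of the 16-pixel RSI table, in raster order.
R = [2, 3, 2, 2, 3, 2, 7, 7, 1, 0, 0, 0, 1, 2, 1, 3]
G = [6, 6, 5, 5, 1, 1, 1, 1, 5, 6, 7, 7, 7, 7, 6, 5]
# (x, y) structure attributes in the same raster order as the table.
XY = [(x, y) for x in range(4) for y in range(4)]


def rvi(r, g):
    """Band functional from the table: RVI = R - G + 7."""
    return r - g + 7


def contour(values, p, q):
    """Preimage of [p, q]: indices of pixels whose value lies in [p, q]."""
    return [i for i, v in enumerate(values) if p <= v <= q]


RVI = [rvi(r, g) for r, g in zip(R, G)]


def rvi_neighbors(x, y, eps=1):
    """Hypothetical helper: pixels whose RVI is within eps of pixel (x, y).

    Pixels outside this contour can be pruned before any more
    expensive neighborhood test is applied.
    """
    center = RVI[XY.index((x, y))]
    return [XY[i] for i in contour(RVI, center - eps, center + eps)]


print(RVI)                  # [3, 4, 4, 4, 9, 8, 13, 13, 3, 1, 0, 0, 1, 2, 2, 5]
print(contour(RVI, 8, 13))  # [4, 5, 6, 7]
print(rvi_neighbors(1, 2))  # [(1, 2), (1, 3)]
```

Here the contour of [8, 13] under RVI selects the four pixels with x = 1, so the remaining twelve pixels would be pruned in a single pass over one derived column.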
Diagrams (k = 2 or k = 3)
• Attributes
  – represented by A1 to A8
  – depicted by straight lines
• Data points
  – represented by different colors
  – individual values depicted by colored dots
  – values scaled on an attribute-line

Parallel Coordinates
[Figure]

Jewell Diagram
[Figure]

Jewell Diagram 2
[Figure]

Jewell Diagram 3
[Figure]

AUGH (Augmented Himalayan Chain) 1
[Figure]

AUGH 2
[Figure]

AUGH 3
[Figure]

Comparisons
• Jewell diagram
• AUGH diagram

Conclusions 1
• Comparison of diagrams

Conclusions 2
• Diagrams and Scalability
  – Provide a method for viewing n-dimensional data
  – Provide preliminary and rough interpretation of clustering and outlier detection
  – Might be useful in pruning dataset, addressing scalability issues, and identifying outliers

Conclusions 3
• These are VERY preliminary results and further work with full datasets is necessary before the advantage of their use can be fully understood
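As a supplementary note on the Diagrams slide, the step "values scaled on an attribute-line" can be sketched as follows. This is a plain parallel-coordinates layout under the assumption of min-max scaling; it is not the Jewell or AUGH construction, which the slides show only as figures. The names `scale_to_axes` and `polylines` are hypothetical.

```python
def scale_to_axes(rows):
    """Min-max scale each attribute (column) to [0, 1].

    Assumption: each attribute-line runs from the column minimum (0.0)
    to the column maximum (1.0). Constant columns are placed at 0.5.
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / span if span else 0.5
         for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def polylines(rows):
    """One polyline per data point: (attribute index, scaled position)."""
    return [list(enumerate(scaled)) for scaled in scale_to_axes(rows)]


# Three (R, G, B) pixels taken from the 16-pixel table.
pixels = [(2, 6, 1), (7, 1, 0), (0, 7, 0)]
for line in polylines(pixels):
    print(line)
```

Each resulting polyline can be drawn in its own color across the attribute-lines, which corresponds to the colored dots described on the slide.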