Vertica Projection structure

In Vertica every table needs one projection at least called superprojection, and like we said in the previuos tutorial is a optimized materialized view.

The superprojection will contain all the columns of the table. The superprojections are created by default on every CREATE TABLE and CREATE TEMPORARY TABLE statements.

The structure of a projection

To create a projection use the following sintax:

CREATE PROJECTION [ IF NOT EXISTS ]
...[[db-name.]schema.]projection-name
...[ ( { projection-column
...| { GROUPED ( column-reference1, column-reference2 [ ,... ])}
......... [ ACCESSRANK integer ]
......... [ ENCODING encoding-type (jump)] } [ ,... ] )
...]
AS SELECT table-column [ , ... ] FROM table-reference [ , ... ]
... [ WHERE join-predicate [ AND join-predicate ] ...
... [ ORDER BY table-column [ , ... ] ]
... [ hash-segmentation-clause
... | range-segmentation-clause
... | UNSEGMENTED { NODE node | ALL NODES } ]
... [ KSAFE [ k-num ] ]

 

The projection sintax breacks down into 4 parts:

  • Column list and encoding
  • It will contain the list of columns that are to be created in the projection and the type of encoding that will be applied on them.

  • Base query
  • It will contain the list of columns that are to be used with full table name reference.

  • Sort order
  • The sort order optimizes for a specific query or commonalities in a class of queries based on the query predicate.If you do not specify a sort order, Vertica uses the order in which columns are specified in the column definition as the projection's sort order.

  • Segmentation
  • It will determines whether a projection is segmented across nodes within the database. Segmentation maximizes database performance by distributing the load.

    Example of a projection:
      CREATE PROJECTION projection_one (
      col_1 ENCODING RLE ,
      col2 ENCODING RLE ,
      col3 )
    AS SELECT
      table_one.col_one,
      table_two.col_two1,
      table_two.col_two1
    FROM
      schema_one.table_one
      schema_one.table_two
    ORDER BY
      table_one.col_one,
      table_two.col_two1
    UNSEGMENTED ALL NODES;

    As you can see int this example we have created a projection called "projection_one" that will have 3 columns (col_1,col2,col3) that receive diferent types of encoding depending on their content, we also can see that as base query we are using the column(col_one.table_one,col_two1.table_two,col_two1.table_two) from tables(schema_one.table_one schema_one.table_two) ordering them by columns (table_one.col_one,table_two.col_two1) and we chose not unsegment(replicate) the projection on all nodes


    Now that we have seen how to create projection let's go thru the CREATE PROJECTION options

    Option of Nodes Required
    projection name Specifies the name of the projection to create
    projection-columns Specifies the name of one or more columns in the projection. The column data type is that of the corresponding column in the schema table
    encoding Specifies the type of encoding to use on the column. By default, the encoding type is auto
    ACCESSRANK is useful if you want to increase or decrease the speed at which a column is accessed.
    WHERE join-predicate Specifies foreign-key = primary-key equijoins between the large and smaller tables. No other predicates are allowed.
    ORDER BY table-column Specifies the columns to sort the projection on.
    hash-segmentation-clause Segments a projection evenly and distributes the data across nodes using a built-in hash function.
    range-segmentation-clause Allows you to segment a projection based on a known range of values stored in a specific column chosen to provide even distribution of data across a set of nodes, resulting in optimal query execution.
    UNSEGMENTED { NODE node | ALL NODES }
  • NODE node?reates an unsegmented projection only on the specified node. Projections for small tables must be UNSEGMENTED.
  • ALL NODES?utomatically replicates the unsegmented projection on each node. To perform distributed query execution, Vertica requires an unsegmented copy of each small table superprojection on each node
  • KSAFE [ k-num ]