Database Scripts

This script will show you what is the size of you Database Catalog.

Use this information to set your General Resource Pool accordingly by keeping in mind that the Vertica Catalog is not loaded in the General Pool allocated Memory and is loaded into the memory that is outside the  General Pool allocated Memory threshold.

 

Vertica Data Collector

In this article i will go thru the steps of disabling the Data Collector in Vertica.

Data Collector is on by default. So we will make some changes to that as i am running this in my AWS Vertica Cluster and since is running on  on-demand EC2 schema i am really not interested in improving the performance of my database using historical system activities and counters.

The Data Collector can create extra overhead and i dont what this.

 

What is the Data Collector, in case you dont know ?

Is the utility that collects and retains database monitoring information.

Data Collector retains history of important system activities and records essential performance and resource utilization counters. You can use information the Data Collector retains in the following ways:

  • As a reference for what actions users have taken
  • To locate performance bottlenecks
  • To identify potential improvements to Vertica configuration

 

Lets start and see what are the steps to disable Data Collector.

See if the Data Collector is enabled.

  • we can see that we have it enabled.

To Disable the Data Collector we need to alter the value of the EnableDataCollector value to 0(zero).

The location to the where the Data Collector files are stored is usually in the  path of your Database Catalog.

Here is the content of my DataCollector.

  • it contains a bunch of .log files and .sql files.

Lets see how much space Data Collector is using

  • this will depend a lot on the Data Collector policy you have in place.

To Clear the Data Collector data we need to use the clear_data_collector() function.

  • this function will clear all the data from the disk and memory.

Let`s see how much space Data Collector is using after we run clear_data_collector() function.

Now that we have disabled Data Collector no data will generated and no overhead will be applied on your database.

Note:

Data Collector is very useful so we can analyze our Database overall performance so be careful before going ahead with the clean up.

Os tutoriais são destinados a mostrar como criar e apagar objetos de banco de dados,seleção de dados, filtragem de dados, classificação de dados, funções embutidas, tabelas de união, recolha de dados e muito mais. Eles vão te dar um bom começo para a codificação SQL e irá também ajudar a começar com a instalação e configuração de banco de dados Vertica básica. Para começar a aprender hoje, basta seguir os tutoriais sobre o menu do lado esquerdo.

In this tutorial we will learn how to export one table and restore it using vsql commands.

  • Backup:
  • In this step we will make use of “EXPORT_TABLES” Vertica native function that will export the DDL of our
    table to the file we tell it to buy using the “-o” vsql option. Make sure you use full path description.

    Next we will export the content of the table we choose to export and we zipp it in the same process.

    Time to restore our objects.

    First create the table by running the script. Next use the “COPY” command to load the table with the data
    from the file table_one.gz, good to remember that Vertica will work with zipped file.(gret option to save netwok IO).

  • Restore :
  • More on how to export and import objects in Vertica in the next tutorials.

    Vertica uses vbr.py tools that is written in python.
    This utility lets you create full and incremental database snapshots, as well as snapshots of specific
    schema’s or tables for use with a multitenant database.

    You can run the Vertica vbr.py from a crontab job or other task scheduler.

    Location where you can save the backups:

    • A local directory on the nodes in the cluster
    • One or more hosts outside of the cluster
    • A different Vertica cluster (effectively cloning your database)

    Snapshots create with the vbr.py tool requires the same to tool to be restored.
    As well as snapshots made with backup.sh tools are only to be restored with backup.sh tool.

    Snapshot Configuration Parameters.

    Parameter:

    • RemoveSnapshotInterval

    s the number of seconds that Vertica checks for snapshots that can be automatically removed.
    Default value -3600
    To alter this parameter :

    • SnapshotRetentionTime

    -How long a snapshot can exist before it can be automatically removed.
    Default value -3600
    To alter this parameter :

    VBR Terminology

     

    • Snapshots

     

    • A consistent image of all objects and data in the database at the time the snapshot is taken.
      -Object-level snapshots consist of a subset of database objects, selected by the user. They include other
      objects in the dependency graph, and associated data in the database at the time the object-level snapshot is taken.
      -We can name our snapshot as we like (blablasnap,1snap,fullsnap,etc)
    • Archive
    • A number of same-name snapshots, past and psent. Each archive can have a different retention policy.
      Example :
      If TSnap names a snapshot of table T, and you take the snapshot daily, keeping 7 snapshots in the archive would
      let you revert back to any of the last week’s snapshots of table T.
    • Backup location

    The directory location on a backup host where snapshots are saved. This location can comprise multiple snapshots,
    including associated archives. All snapshots in the same backup location share data files (through hard links). The
    snapshots are also compatible, meaning that after restoring a full database snapshot, any object snapshot from the same
    backup location is eligible to be restored.

    • Object-level snapshot

    One or more schemas or tables, or group of such objects, saved on a backup host. The conglomerate parts of the object-level snapshot do not contain the entire database.
    In earlier Vertica versions, object-level snapshots could not exist because a snapshot always contained the entire database.

    • Incremental backups

    A successive backup consisting only of new or changed data.

    • Selected objects

    The objects chosen to be part of an object-level snapshot. For example, if tables T1 and T2 are backed up in a snapshot, these objects comprise the selected objects.

    • Dependent objects

    Objects that should be part of an object-level snapshot due to their dependency. For example, a table with a foreign key can exist on its own, but must be backed up with the primary key table, due to table constraints. Projections anchored on a table in the selected objects are also dependent objects.

    • Principal objects

    The objects on which both selected and dependent objects depend. For instance, each table and projection has an owner. The owner is a principal object.

    When to Back-up the Database

    Whenever you choose to, depending on your needs

     

    • Before you upgrade Vertica to another release.
    • Before you drop a partition.
    • After you load a large volume of data.
    • If the epoch in the latest snapshot is earlier than the current ancient history mark .
    • Before and after you add, remove, or replace nodes in your database cluster.
    • After recovering a cluster from a crash.

     

    Note: When you restore a database snapshot, you must restore to a cluster that is identical to the one on which you created the snapshot. For this reason, always create a new snapshot after adding, removing, or replacing nodes.

    Configuring Backup Hosts

    You can backup you database in one or more location, being use full in creating offsite data backups.
    The backup hosts you use must:

     

    • Have sufficient backup disk space.
    • Be accessible from your database cluster.
    • Have password less SSH access for the database administrator account.
    • Have a copy of the same versions of Python and rsync that were installed by the Vertica installer.

     

    Configuring Single-Node Database Hosts for Backup

    Installing Vertica on a single-node database host automatically sets up the node with passwordless SSH access.
    The vbr.py utility requires that all database hosts (including single-node hosts) and backup location hosts have passwordless SSH access.

    Estimating Backup Host Disk Requirements

    Wherever you plan to save data backups, consider the disk requirements for incremental backups at your site.
    To see the total used space on your cluster:

    To see space used on your cluster separated for each node

    Generating the vbr.py Configuration File

    The vbr.py utility uses a configuration file for the information it requires to back up and restore a full- or object level snapshot.
    You cannot run vbr.py without a configuration file, and no default file exists.

    Creating a vbr Configuration File

    You can alter the generated file manually as well.

    Now to back-up the database follow the syntax :

    The command to restore the database is :

    Follow a full example of backup-destroy-restore(full/all objects) of a cluster using vbr.py tool:

    1 -see cluster state.
    2 -see cluster objects
    3 -backup the cluster with the full_backup.ini script
    4 -drop all objects in the cluster
    5 -stop the cluster database
    6 -restore the cluster database
    7 -start cluster database
    8- Verify cluster database content

    And there you go we have restored our cluster database successfully.

    Generating Different types of vbr.py Configuration File
    To create a configuration file:

    • 1. From a command terminal, enter vbr.py command:

    Sample Session Configuring Required Parameters:
    We are going to describe each parameter of the vbr.py backup configuration file:
    Note you can have as many backup configuration file as you need (full/incremental, objects/schema specific, location specific).

    We saved the vbr.py configuration to exampleBackup.ini in the actual work directory, or if
    you specify the full path when setting config file name
    Example:

    Note:
    Directory ackupConfdoes not exists by default you can create one for the purpose of a better organization.

    Configuring Advanced VBR Parameter
    To use advanced settings in configuring the backup file choose when:

    Examples of backup config files:
    (All examples are made on a 3 node cluster called luster)
    Before all we must choose our backup host and create specific directories that can be written by the user who running the backup/restore duties.

    We will chose our monitoring host for the backup storage (Vertica_Master1)
    And we will create the directories needed.

    All directories are created now let us create the backup config files:

    1-Full database backup config file with 1 level of retention (will be overwritten every time a backup occurs).

    See content of the full_backup.ini file

    2-Full/Incremental database backup config file with more than one level of retention (values 1-99).

    See content of the backup_incremental_7.ini file

    3- Object specific database backup config file (as well can have more levels of retention).

    Chose some object/objects in the database to backup only

    We will backup table public.one

    See content of the backup_table_one.ini file

    Testing our backup configuration files
    To run backup on Vertica using the configuration file created use the following sintax:

    To run restore a Vertica backup using the configuration file created use the following sintax:

    NOTE:
    -to restore/recover a full database you need to have the database shutdown.

    Example of test backups

    Full backup

    The backup files were generated in the specified location

    Incremental backup

    The backup files were generated in the specified location

    Single object backup

    The backup files were generated in the specified location

    Verify backup location

    So backup were created and placed in the supplied directories during backup configuration file.

    Restoring the backup of a database from an actual backup

    Syntax:

    Let’s drop all the objects in our database and restore it with our recently made backup

    Great, now stop the database so we can restore it. Other-wise you will get an error telling you :

    So the database is down.

    Now restore the database:

    Startup the database and check that your objects are available :

    Now let simulate the loss of a table and in this case we will restore the single backup option:

    Note:For this restore database must be up and running

    Now check that the object is in the database:

    Great out table is back and ready to use.

    Automating Regular Backups

    The vbr.py utility helps to automate backing up you database, because you can configure vbr with the required run-time parameters. The ability to configure runtime parameters facilitates adding the utility to a cron or other task scheduler to fully automate regular backing up your database.
    Example of crontab syntax:

    How i put it inside my crontab.

    So our crontab job will do a full back-up every day at 1 am and will write the output into a log file that resides into

     

    Displays run-time parameters for the current. No special permission is required to be able to do this action.

    Syntax



    Parameters

    Table of parameters and their description:

    AUTOCOMMIT

    Displays whether statements automatically commit their transactions when they complete.

    AVAILABLE_ROLES

    Lists all roles available to the user.

    DATESTYLE

    Displays the current style of date values

    ENABLED_ROLES

    Displays the roles enabled for the current session.

    ESCAPE_STRING_WARNING

    Displays whether warnings are issued when backslash escapes are found in strings.

    INTERVALSTYLE

    Displays whether units are output when printing intervals.

    LOCALE

    Displays the current locale.

    MEMORYCAP

    Displays the maximum amount of memory that any request use.

    RESOURCE_POOL

    Displays the resource pool that the session is using.

    RUNTIMECAP

    Displays the maximum amount of time that queries can run in the session.

    SEARCH_PATH

    Displays the order in which Vertica searches schemas.

    SESSION_CHARACTERISTICS

    Displays the transaction characteristics.

    STANDARD_CONFORMING_STRINGS

    Displays whether backslash escapes are enabled for the session..

    TEMPSPACECAP

    Displays the maximum amount of temporary file space that queries can use in the session.

    TIMEZONE

    Displays the timezone set in the current session.

    TRANSACTION_ISOLATION

    Displays the current transaction isolation setting.

    TRANSACTION_READ_ONLY

    Displays the current setting.

    ALL

    Shows all run-time parameters.



    Examples of use of “SHOW” command in Vertica:

  • See timezone parameter value:
  • See resource_pool that is in use by it’s session:
  • Syntax used to restart a Vertica node.

    During a full database restore, the node must be DOWN. If you start the restore process and the node is UP, vbr.py displays the following message:

     

    • Restoring from a Specific Snapshot

    The snapshot has to exist before doing this.,

    Syntax:

    Example:

    Note: the archive must corresponds with the type of backup done.

    You will restore a snapshot created on 2012/09/06 at 15:26:40 sec.
    The archive option need to receive the date_timestamp parameter that is appended to the snapshots archive.

    Restoring From the Most Recent Snapshot
    Syntax:

    Example:

  • Replication the Vertica process of storing identical copies of data across all nodes in a cluster.
  • Replication

  • will offer H.A.(high avalibility) to our cluster by creating buddy projections across all nodes.
  • this way every node will have a copy of the other node projections enabling us to recover from a node loss.
  • Replication creates and stores identical copies of projections for small tables across all nodes in the cluster.
  • Replication is best used for a small projection, copy the full projection to each node.

    Example :

    Vertica Replication

    Vertica Replication

    In the image we can see that we have 4 projections that are replicated across all nodes.

    Here is the table with all tha data types used in Vertica database.

    Vertica Data Types Length (Range) Description
    BINARY DATA TYPES
     BINARY  1 to 65000 Fixed-length binary string
     VARBINARY  1 to 65000 Variable-length binary string
     BYTEA  1 to 65000  Variable-length binary string (synonym for VARBINARY)
     RAW  1 to 65000 Variable-length binary string (synonym for VARBINARY)
    BOOLEAN DATA TYPES
     BOOLEAN  1  True or False or NULL
    CHARACTER DATA TYPES
     CHAR 1 to 65000  Fixed-length character string
     VARCHAR  1 to 65000  Variable-length character string
    DATE/TIME DATA TYPES
     DATE 8 Represents a month, day, and year
     DATETIME 8 Represents a date and time with or without timezone (synonym for TIMESTAMP)
     SMALLDATETIME  8 Represents a date and time with or without timezone (synonym for TIMESTAMP)
     TIME  8 Represents a time of day without timezone
     TIME WITH TIMEZONE 8 Represents a time of day with timezone
     TIMESTAMP 8 Represents a date and time without timezone
     TIMESTAMP WITH TIMEZONE 8 Represents a date and time with timezone
     INTERVAL 8 Measures the difference between two points in time
    APPROXIMATE NUMERIC DATA TYPES
     DOUBLE PRECISION  8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage
     FLOAT  8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage
     FLOAT(n)  8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage
     FLOAT8  8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage
     REAL  8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage
    EXACT NUMERIC DATA TYPES
     INTEGER  8 Signed 64-bit integer, requiring 8 bytes NULLS FIRST of storage
     INT  8 Signed 64-bit integer, requiring 8 bytes NULLS FIRST of storage
     BIGINT  8 Signed 64-bit integer, requiring 8 bytes NULLS FIRST of storage
     INT8  8 Signed 64-bit integer, requiring 8 bytes NULLS FIRST of storage
     SMALLINT  8 Signed 64-bit integer, requiring 8 bytes NULLS FIRST of storage
     TINYINT  8 Signed 64-bit integer, requiring 8 bytes NULLS FIRST of storage
     DECIMAL  8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits
     NUMERIC  8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits
     NUMBER  8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits
     MONEY  8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits

    datatype

    What is a projection ?

  • is the Vertica phisical layer from wich data is queried.
  • projections are automaticly refreshed when new data is loaded or modified.
  • data is encoded and compressed in projections .
  • each projection stores a separate copy of data even if data is from the same table .
  • Projections are able to mantaing redundant copies on one or more cluster nodes of data
    to obtain high availability .
    Projection can also contain partitioned data .
    Vertica can contain as many projection as needed to provide optimization for all queries.

  • some projection might contain the same data or similar data as other projections but may be sorted in a different way for optimization reasons.
  • Types of projections

    Super Projection

  • contains all the column from a individual table.
  • they are contained automatilly after table are created and data was loaded.
  • Pre-Join Projection

  • it stores the result of join between Fact and a Dimension Table
  • created from denormalized tables .
  • join is auto
  • refreshed as data is modified in the schema.
  • Query-Specific Projections

  • created by a subset of column from a table that has been specified for a specific query .
  • Buddy Projections

  • they contain the same columns and segmentation as another projection.
  • they located on multiple nodes to offer high availability
  • Way to create projections

  • the moment the table is loaded the super-projection is created.
  • un-optimized projections are created while initial load.
  • when optimized projections are crated the un-optimized projections
    will be dropped .
  • manually , using the vsql CREATE PROJECTION command.

  • Share this with everybody

    A profile controls:

  • How often users must change their passwords.
  • How many times users must change their passwords before they can reuse an old password.
  • How many times users can fail to log in before their account is locked.
  • The required length and content of the password (maximum and minimum amount of characters
    and the minimum number of letters, capital letters, lowercase letters, digits, and symbols that must be in a password).
  • Profiles are need to enforce diferent classes of users and diferent policies depending on the user/group you asign them to.
    If no profile is asigned to the user when created the DEFAULT profile will be asigned to him.
    Syntax:



    Example:

    The combinations are unlimited and mostly depend on you application requirements.

    Vertica Database Articles

    Drop user will simply drop a user form the database.

    Syntax:

    The CASCADE option will drop all objects created by the user dropped, including schema, table and all views that reference the table, and the table’s associated projections.

    Example:

    If the user that is droped has objects owned in the database and the CASCADE option is not used an error will be received.

    Alter user Vertica

    Users can only be altered by the admin user. The changes applied on the user will only apply in the future sessions not in the current session.

    Changes that can be made by regular users:

  • can change their own passowrd using the “IDENTIFIED BY” commnad.
  • can change their default “RESOURCE POOL” to any other as long as they have been granted access to it.
  • they can alter their search path.
  • The syntax to alter an user is:

    Examples:

  • Unlock user account:


  • Change user passowrd:
  • Unlock user account:
  • This are some of the examples on how we can alter an user account.

    Syntax

    Example:

    Newly-created users do not have access to schema PUBLIC by default. Make sure to GRANT USAGE ON SCHEMA PUBLIC to all users you create.

    To do so give to them the following grant:

    In Vertica new created users have the right to create temporary tables in the database, so no special wrights are needed.

    The DELETE command in Vertica does not delete data imediateley, but it marks records as deleted data with delete vectors.

    Delete statement marks tuples as no longer valid in the current epoch, marking the records for deletion in the WOS(Write Optimized Store),
    rather than deleting data from disk storage.



    Pros and cons when using delete:

    Pros

    • marked rows as deleted so that they remain available to historical queries.
    • delete process can be long and costfull so saving time and resources at the time we chose the deleted data is important.

    Cons

    • uses disk space for the deleted rows and delete markers(as Vertica will write a new file that will point out to the deleted data).
    • performance penalty when reading and skipping over deleted data.

    How to permanently remove data from Vertica ?

    The purge operation permanently removes deleted data from physical storage so that the disk space can be reused.

    How to use purge ?

    • Setup a purge policy or purge data manually.

    To purge data manually Vertica offers the following functions:

    • PURGE_PROJECTION() purges a specified projection.
    • PURGE_TABLE() purges all projections on the specified table.
    • PURGE() purges all projections in the physical schema.

    purge

    In this tutorial we will learn how to export one table and restore it using vsql commands.

  • Backup:
  • In this step we will make use of “EXPORT_TABLES” Vertica native function that will export the DDL of our
    table to the file we tell it to buy using the “-o” vsql option. Make sure you use full path description.

    Next we will export the content of the table we choose to export and we zipp it in the same process.



    Time to restore our objects.

    First create the table by running the script. Next use the “COPY” command to load the table with the data
    from the file table_one.gz, good to remember that Vertica will work with zipped file.(gret option to save netwok IO).

  • Restore :
  • More on how to export and import objects in Vertica in the next tutorials.

    Vertica Database Articles

    Vertica uses vbr.py tools that is written in python.
    This utility lets you create full and incremental database snapshots, as well as snapshots of specific
    schemas or tables for use with a multitenant database.

    You can run the Vertica vbr.py from a cron job or other task scheduler.

    Location where you can save the backups:

    • A local directory on the nodes in the cluster
    • One or more hosts outside of the cluster
    • A different Vertica cluster (effectively cloning your database)

    Snapshots create with the vbr.py tool requires the same to tool to be restored.
    As well as snapshots made with backup.sh tools are only to be restored with backup.sh tool.

    Snapshot Configuration Parameters.

    Parameter:

    • RemoveSnapshotInterval

    Is the number of seconds that Vertica checks for snapshots that can be automatically removed.
    Default value -3600
    To alter this parameter :

    • SnapshotRetentionTime

    -How long a snapshot can exist before it can be automatically removed.
    Default value -3600
    To alter this parameter :

    VBR Terminology

    • Snapshots

    – A consistent image of all objects and data in the database at the time the snapshot is taken.
    -Object-level snapshots consist of a subset of database objects, selected by the user. They include other
    objects in the dependency graph, and associated data in the database at the time the object-level snapshot is taken.
    -We can name our snapshot as we like (blablasnap,1snap,fullsnap,etc)

    • Archive
    • A number of same-name snapshots, past and sent. Each archive can have a different retention policy.
      Example :
      If TSnap names a snapshot of table T, and you take the snapshot daily, keeping 7 snapshots in the archive would
      let you revert back to any of the last week’s snapshots of table T.
    • Backup location

    The directory location on a backup host where snapshots are saved. This location can comprise multiple snapshots,
    including associated archives. All snapshots in the same backup location share data files (through hard links). The
    snapshots are also compatible, meaning that after restoring a full database snapshot, any object snapshot from the same
    backup location is eligible to be restored.

    • Object-level snapshot

    One or more schemas or tables, or group of such objects, saved on a backup host. The conglomerate parts of the object-level snapshot do not contain the entire database.
    In earlier Vertica versions, object-level snapshots could not exist because a snapshot always contained the entire database.

    • Incremental backups

    A successive backup consisting only of new or changed data.

    • Selected objects

    The objects chosen to be part of an object-level snapshot. For example, if tables T1 and T2 are backed up in a snapshot, these objects comprise the selected objects

    • Dependent objects

    Objects that should be part of an object-level snapshot due to their dependency. For example, a table with a foreign key can exist on its own, but must be backed up with the primary key table, due to table constraints. Projections anchored on a table in the selected objects are also dependent objects.

    • Principal objects

    The objects on which both selected and dependent objects depend. For instance, each table and projection has an owner. The owner is a principal object.

    When to Back-up the Database ?

    Whenever you choose to, depending on your needs

    • Before you upgrade Vertica to another release.
    • Before you drop a partition.
    • After you load a large volume of data.
    • If the epoch in the latest snapshot is earlier than the current ancient history mark .
    • Before and after you add, remove, or replace nodes in your database cluster.
    • After recovering a cluster from a crash.

    Note: When you restore a database snapshot, you must restore to a cluster that is identical to the one on which you created the snapshot. For this reason, always create a new snapshot after adding, removing, or replacing nodes.

    Configuring Backup Hosts

    You can backup you database in one or more location, being use full in creating offsite data backups.
    The backup hosts you use must:

    • Have sufficient backup disk space.
    • Be accessible from your database cluster.
    • Have password less SSH access for the database administrator account.
    • Have a copy of the same versions of Python and rsync that were installed by the Vertica installer.

    Configuring Single-Node Database Hosts for Backup
    Installing Vertica on a single-node database host automatically sets up the node with passwordless SSH access.
    The vbr.py utility requires that all database hosts (including single-node hosts) and backup location hosts have passwordless SSH access.

    Estimating Backup Host Disk Requirements
    Wherever you plan to save data backups, consider the disk requirements for incremental backups at your site.
    To see the total used space on your cluster:

    To see space used on your cluster separated for each node :

    Generating the vbr.py Configuration File
    The vbr.py utility uses a configuration file for the information it requires to back up and restore a full- or object level snapshot.
    You cannot run vbr.py without a configuration file, and no default file exists.

    Creating a vbr Configuration File

    You can alter the generated file manually as well.

    Now to back-up the database follow the sintax :

    The command to restore the database is :

    Follow a full example of backup-destroy-restore(full/all objects) of a cluster using vbr.py tool:

    1- see cluster state.
    2- see cluster objects
    3- backup the cluster with the full_backup.ini script
    4- drop all objects in the cluster
    5- stop the cluster database
    6- restore the cluster database
    7- start cluster database
    8- Verify cluster database content

    And there you go we have restored our cluster database successfully.

    Generating Different types of vbr.py Configuration File

    To create a configuration file:

    • 1. From a command terminal, enter vbr.py command

    Sample Session Configuring Required Parameters:
    We are going to describe each parameter of the vbr.py backup configuration file:
    Note you can have as many backup configuration file as you need (full/incremental, objects/schema specific, location specific).

    Saved vbr.py configuration to exampleBackup.ini in the actual work directory, or if
    you specify the full path when setting ?onfig file name?
    Example:

    Note:
    Directory BackupConf does not exists by default you can create one for the purpose of a better organization.

    Configuring Advanced VBR Parameter
    To use advanced settings in configuring the backup file choose when:

    Examples of backup config files:
    (All examples are made on a 3 node cluster called ?luster?
    Before all we must choose our backup host and create specific directories that can be written by the user who? running the backup/restore duties.

    We will chose our monitoring host for the backup storage (Vertica_Master1)
    And we will create the directories needed.

    All directories are created now let? create the backup config files:

    1-Full database backup config file with 1 level of retention (will be overwritten every time a backup occurs).

    See content of the full_backup.ini file

    2-Full/Incremental database backup config file with more than one level of retention (values 1-99).

    See content of the backup_incremental_7.ini file

    3- Object specific database backup config file (as well can have more levels of retention).
    Chose some object/objects in the database to backup only

    We will backup table public.one

    See content of the backup_table_one.ini file

    Testing our backup configuration files
    To run backup on Vertica using the configuration file created use the following sintax:

    To run restore a Vertica backup using the configuration file created use the following sintax:

    NOTE:
    -to restore/recover a full database you need to have the database shutdown.

    Example of test backups:
    Full backup

    The backup files were generated in the specified location

    Incremental backup

    The backup files were generated in the specified location

    Single object backup

    The backup files were generated in the specified location

    Verify backup location:

    So backup were created and placed in the supplied directories during backup configuration file.

    Restoring the backup of a database from an actual backup

    Syntax:

    Let’s drop all the objects in our database and restore it with our recently made backup

    Great, now stop the database so we can restore it. Other-wise you will get an error telling you :

    So the database is down. Now restore the database:

    Startup the database and check that your objects are available :

    Now let’s simulate the loss of a table and in this case we will restore the single backup option:

    Note:For this restore database must be up and running

    Now check that the object is in the database:

    Great out table is back and ready to use.

    Automating Regular Backups

    The vbr.py utility helps to automate backing up you database, because you can configure vbr with the required run time parameters. The ability to configure run-time parameters facilitates adding the utility to a cron or other task scheduler to fully automate regular backing up your database.
    Example of crontab syntax:

    How i put it inside my crontab.

    So our crontab job will do a full back-up every day at 1 am and will write the output into a log file that resides into /vert_backup/backup_logs/cronjobs.log

    images

    Displays run-time parameters for the current. No special permision is required to be able to do this action.
    Syntax

    Parameters



    Table of parameters and their description:

    AUTOCOMMIT

    Displays whether statements automatically commit their transactions when they complete.

    AVAILABLE_ROLES

    Lists all roles available to the user.

    DATESTYLE

    Displays the current style of date values

    ENABLED_ROLES

    Displays the roles enabled for the current session.

    ESCAPE_STRING_WARNING

    Displays whether warnings are issued when backslash escapes are found in strings.

    INTERVALSTYLE

    Displays whether units are output when printing intervals.

    LOCALE

    Displays the current locale.

    MEMORYCAP

    Displays the maximum amount of memory that any request use.

    RESOURCE_POOL

    Displays the resource pool that the session is using.

    RUNTIMECAP

    Displays the maximum amount of time that queries can run in the session.

    SEARCH_PATH

    Displays the order in which Vertica searches schemas.

    SESSION_CHARACTERISTICS

    Displays the transaction characteristics.

    STANDARD_CONFORMING_STRINGS

    Displays whether backslash escapes are enabled for the session..

    TEMPSPACECAP

    Displays the maximum amount of temporary file space that queries can use in the session.

    TIMEZONE

    Displays the timezone set in the current session.

    TRANSACTION_ISOLATION

    Displays the current transaction isolation setting.

    TRANSACTION_READ_ONLY

    Displays the current setting.

    ALL

    Shows all run-time parameters.

    Examples of use of “SHOW” command in Vertica:

  • See timezone parameter value:


  • See resource_pool that is in use by it’s session:
  • This action removes a schema from the database permanently.

    Syntax

    Parameters options:

  • db-name – is the database name where the schema resides
  • schema – schema to be droped.
  • CASCADE – drop the schema and the objects it contains
  • RESTRICT – only drops tha schema if empty (default)
  • We cannot drop a public schema

    Transactions using the objects inside the schema must complete before the schema can be droped.

    Schema owner can drop a schema even if the owner does not own all the objects within the schema. All the objects within the schema are also dropped.

    Examples

  • Default drop :
  • Drop schema and all objects :
  • Syntax

    Parameters

  • [db-name.] -the current database name
  • schema-name -the name of one or more schemas to rename.
  • RENAME TO -new schema names.
  • Examples how to alter a schema in Vertica

    <

  • Alter the name of one schema:
  • Alter the name of more schemas at once:
  • How to create schema in Vertica ?

    In Vertica like in other database the “SCHEMA” is a part of the Logical Schema.

    Why the need of schemas ?

  • Allows Many users to access the database without interfering with one another.
  • Individual schemas can be configured to grant specific users access to the schema and its tables while restricting others.
  • Allows other applications to create tables that have the same name in different schemas, preventing table collisions.
  • The user who can create a schema is the superuser or a user that received the right to create one.

    Syntax to create a schema

    Where the options are:

  • [ IF NOT EXISTS ] – Will generate an notice if object exist.
  • [db-name.] – Specifies the current database name.
  • schema – Specifies the name of the schema to create
  • AUTHORIZATION user-name – Assigns ownership of the schema to a user. Only a Superuser is allowed to create a schema that is owned by a different user.
  • Examples of creating a schema in Vertica:

  • Create schema :


  • Create a schema “IF NOT EXISTS” and schema does not exist:
  • Create a schema with the “IF NOT EXISTS ” and schema exist:
  • Create a schema where the tables owner will be Martin:
  • In the next tutorial we will see how we can alter a schema.

    Vertica optimizes the database by distributing data across cluster using Segmentation.

    What segmentation does:

    • Segmentation -places a portion of data on a node .
    • will evenly distribute data on all nodes.
    • this way each node will perform a piece of the querying process.
      The query will be run on the cluster and every node will receive
      the query plan,the results of the queries are aggregated and used to create
      the output.

    The job of segmentation is to distribute data evenly across multiple database nodes so that all nodes can participate in query execution.

    segmentation

    In the image we can see an example of data segmentation that Vertica implements. We can see that data is segmented by hash
    on all nodes. The distribution of data into blocks across the cluster will enable fast query processing.

    To import data from a database we can use the copy from command as described here

    -make sure that the object already exists(the DDL) on the database, if not get the DDL of it with the function and create on the new database, to do so use the following commands:

    After this run the output on to the target database.



  • Test the ” CONNECT ” command:
  • -the connect command enables you to connect to another Vertica database that will subsequently enable export and copy from commands.
    Syntax of the CONNECT command is :

    Where the options are :

  • database-the target database name;
  • user-username used in to connect;
  • password-the password of the user;
  • host-hostname or ipaddress of one of the nodes;
  • port-the port on which the Vertica service is running on;
  • You cannot open more then one connection to at the same time, and the connection will be available until you use the DISCONECT command.

  • Test the ” DISCONECT ” command:
  • Closes a previously-established connection to another Vertica database.
    Syntax of the DISCONECT command is :



    Now back to our COPY FROM command, and see how this works.

    Syntax for the COPY FROM command:

    To be able to run this command you need to have the following permisions:

  • SELECT privileges on the source table.
  • USAGE privilege on source table schema.
  • INSERT privileges for the destination table in target database.
  • USAGE privilege on destination table schema.
  • Example showing how to copy several columns from a table in the source database into a table in the local database.

    Each host has a system setting for the Linux environment variable LANG. LANG determines the locale category for native language, local customs, and coded
    character set in the absence of the LC_ALL and other LC_ environment variables. LANG can be used by applications to determine which language to use for
    error messages and instructions, collating sequences, date formats, and so forth.

    To change the LANG setting for the database administrator, edit /home/dbadmin/.profile on all cluster hosts and set the environment variable; for example:

    The LANG setting controls the following in Vertica:

    OS-level errors and warnings. For example, “file not found” during COPY operations.

    The LANG setting does not control the following:

    Vertica-specific error and warning messages. These are always in English at this time.

    Set the locale: (if more than one node, set this in the .profile, .bashrc of the dbadmin user, or just set it as the system wide locale)



    Start Vertica.

    On vsql:

    currency

    Vertica Database Articles

    currency These instructions assume you have already made the standard OS changes to change the IPs (for example, updating /etc/hosts). These instructions detail only the Vertica-specific IP changes.

    To change the IP address of one or more nodes in a cluster:

  • Backup the following three files:


  • 1. Stop Vertica on all nodes.
  • 2. As root, on each node stop spread by using the following command:
  • 3. On each node edit /opt/vertica/config/admintools.conf and change the IPs as required.
  • You can use sed to change each IP in the file, for example to change 10.10.10.81.82 to 192.168.150.108 issue the command:



  • 4. On each node edit /opt/vertica/config/vspread.conf:
  • a. Change the old IPs to the new IPs as required.

    b. Locate the N number for each IP and change it to match the new IP. For example, if the old IP is 10.10.81.8, then the corresponding N number is N010010081008. N numbers consist of 3 digits for each IP number segment, padded with zeros when appropriate (10 becomes 010, 8 become 008, etc.). If the new IP address is 192.168.150.255, then the new corresponding N number is N010010081009.

  • 5. On each node edit /etc/sysconfig/spreadd and change the N numbers to the new N numbers as you changed in vspread.conf.

  • 6. As root, on each node start spread by using the following command:


  • 7. Start a single Vertica node and run vsql.

  • 8. In vsql, issue the following query to verify the new IP has been updated:
  • You can also verify IPs with the following shell commands:

  • Modify the database to use the new IPs for reporting node status:
    • In vsql, issue the command select node_name, node_address from v_catalog.nodes; to show you the current node names configured.


    • For each result, update the IP address by issuing the command: alter node NODE_NAME is hostname ‘new.ip.address‘; where NODE_NAME is
      the node_name and new.ip.address is the new IP of that node. Note: the node must be down to change the IP using the alter node
      command. You can change all the IPs for the cluster from the first node if only the first node is up, then bring down the first node and bring up
      another node to change the IP of the first node.
  • Bring up all nodes in the cluster.
  • Example

    Start the spread on all nodes now then start vertica only on one node and alter the host ip address in the

    First of many tutorials on how we can write administration scripts using the VSQL command line tool and it`s options.

    We will start with basic options:
    Connect to the database :

    Download Vertica Free Hardware Setup Manual from here:

    Vertica Free Hardware Setup Manual

    How to Connect to a specific node/host form the command line
    Where “-h” option will be followed by host name or ip address.

    Now change the node/host

    Where “-h” option will be follwed by host name or ip address.

    Connect to specific database:
    Where “-d” option will be followd by database name.

    Run a single command from the vsql line:
    Where “-c” option will be followd by the command you want to be executed.
    -command will execute and exit

    Redirect output to a file

    Execute commands from file, then exit

    How to List all available databases from vsql.

    This are just few of the many examples of how Vsql command line can be used.

    We will see some options parameters that can be used with Vertica vsql client utility
    The basic syntax is :

    General options:

    • -d DBNAME specify database name to connect to (default: “dbadmin”)
    • -c COMMAND run only single command (SQL or internal) and exit
    • -f FILENAME execute commands from file, then exit
    • -l list available databases, then exit
    • -v NAME=VALUE set vsql variable NAME to VALUE
    • -X do not read startup file (~/.vsqlrc)
    • –help show this help, then exit
    • –version output version information, then exit

    Input and output options:

    • -a echo all input from script
    • -e echo commands sent to server
    • -E display queries that internal commands generate
    • -q run quietly (no messages, only query output)
    • -o FILENAME send query results to file (or |pipe)
    • -n disable enhanced command line editing (readline)
    • -s single-step mode (confirm each query)
    • -S single-line mode (end of line terminates SQL command)

    Output format options:

    • -A unaligned table output mode (-P format=unaligned)
    • -b beep on command completion
    • -H HTML table output mode (-P format=html)
    • -t print rows only (-P tuples_only)
    • -T TEXT set HTML table tag attributes (width, border) (-P tableattr=)
    • -x turn on expanded table output (-P expanded)
    • -P VAR[=ARG] set printing option VAR to ARG (see pset command)
    • -F STRING set field separator for unaligned output (default: “|”) (-P fieldsep=)
    • -R STRING set record separator (default: newline) (-P recordsep=)

    Connection options:

    • -h HOSTNAME database server host or socket directory (default: “local socket”)
    • -p PORT database server port (default: “5433”)
    • -U NAME database user name (default: “dbadmin”)
    • -W prompt for password (should happen automatically)
    • -w PASSWORD database user’s password

    For examples on how we can write scripts that can be used as the Vsql command line in the next tutorial.

    All the events are recorded in the vertica.log file .(this is the first place to look when troubleshooting comes)

    Types & Modes of logs :
    Vertica can enable more types of logs using the “select set_debug_log(‘TM’,’ALL’)” command.

  • admintools log.
  • network logs.
  • db designer logs.
  • copy error logs.
  • Managing sessions

  • To see all active sessions:
  • To close all session on DB:
  • To close a specific session on DB:
  • To close all connections from clients and just allow acess to dbadmin connections
  • Managing locks

  • To see all active locks on DB run :
  • Description of Lock types :

    S – share lock needed for select operations(serializabale mode only)
    I – insert lock
    X – exclusive lock
    T – tuple mover lock , used by Tuple Mover or Copy command
    U – usage locks
    O – owner locks

    Managing performance issues

    Query plans

  • – they indicate data flow during execution .
  • – verifiys that projections are used in query.
  • How can we get the query plan ?
    -adding the “EXPLAIN” command before the query is to be analized.
    -this way an output will be provided with the optimezer decisions.
    Ex:

    Query profileing

  • – will show where time is spent during a query .
  • – will identify the resource pools used .
  • – indicates the memory required by the process.
  • How can we get the query profileing ?
    -adding the “PROFILE” command before the query is to be analized.
    Ex:

    – an exit with the min memory req will be output to estimate the query needs.

    Spread service

  • – spread is an open source toolkit used in Vertica to perform high perf messaging .
  • – each node will have spread installed during database installation process.
  • – do not alter spread config.
  • Checking if spread is running :

    Checking spread status :

    Start spread service :

    In this tutorial we will learn how to export one table and restore it using vsql commands.

  • Backup:
  • In this step we will make use of “EXPORT_TABLES” Vertica native function that will export the DDL of our
    table to the file we tell it to buy using the “-o” vsql option. Make sure you use full path description.

    Next we will export the content of the table we choose to export and we zipp it in the same process.

    Time to restore out objects.

    First create the table by running the script. Next use the “COPY” command to load the table with the data
    from the file table_one.gz, good to remember that Vertica will work with zipped file.(gret option to save netwok IO).

  • Restore :
  • More on how to export and import objects in Vertica in the next tutorials.

    Most of Vertica’s administrative tasks are done using Administration Tools(admintools tool).

    To run the admintools you need to use the user account that owns the database catalog and data storage on disk.

    Syntax

    Parameters