Launching OmniSci Open Source on GCP 

03-21-2019 23:31

In this tutorial I provide a step-by-step procedure for deploying OmniSci Open Source Edition on Google Cloud Platform (GCP). These instructions assume that you have a GCP account; if you don’t, follow Google's account setup instructions first.

Creating a Virtual Machine

On the GCP Console, select Compute Engine and select an existing project or create a new project.


Select Create Instance for creating the OmniSci virtual machine (VM).


Enter a name for the VM and select Customize under Machine Type to choose the compute configuration. This choice also affects which regions and zones are initially available for deploying the VM.

The following is a minimal configuration for running OmniSci; refer to the Hardware Configuration Reference Guide for guidance on production deployments. Change the Number of GPUs to 1 and select NVIDIA Tesla P4 from the GPU Type drop-down menu. P4 GPUs are available in the us-west2 region and the us-west2-c zone, so set the region and zone accordingly. Also increase the number of CPU cores to 4 and the memory to 15 GB.

Change the boot disk: select CentOS 7 (x86_64) and set the size of the boot disk to 20 GB. The supported operating systems for OmniSci are CentOS/RHEL 7.0 or later and Ubuntu 16.04 or later.

Enable default access, allow HTTP/HTTPS traffic and create the VM.
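
The console steps above can also be expressed as a single gcloud command. The following is a sketch rather than a tested recipe: the instance name omnisci-os is a hypothetical placeholder, and the centos-7 image family in the centos-cloud project is an assumption about which public image matches the CentOS 7 boot disk chosen above. GPU instances cannot be live-migrated, which is why the maintenance policy is set to TERMINATE. The script only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch: build a gcloud command that mirrors the console settings above.
# INSTANCE_NAME is a hypothetical placeholder; the image family and project
# are assumptions, not values taken from the tutorial.
INSTANCE_NAME="${INSTANCE_NAME:-omnisci-os}"
ZONE="${ZONE:-us-west2-c}"

build_create_cmd() {
  # 4 vCPUs, 15 GB memory, one Tesla P4, 20 GB CentOS 7 boot disk,
  # and the network tags that the console's HTTP/HTTPS checkboxes apply.
  printf '%s ' gcloud compute instances create "$INSTANCE_NAME" \
    --zone="$ZONE" \
    --custom-cpu=4 --custom-memory=15GB \
    --accelerator=type=nvidia-tesla-p4,count=1 \
    --maintenance-policy=TERMINATE \
    --image-family=centos-7 --image-project=centos-cloud \
    --boot-disk-size=20GB \
    --tags=http-server,https-server
  printf '\n'
}

# Dry run: show the command without executing it.
build_create_cmd
# To execute it after review: eval "$(build_create_cmd)"
```

Printing first keeps the example safe to run on a machine without the Cloud SDK installed or without a configured project.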

Firewall Rules

Go to the Firewall rules page in the Google Cloud Platform Console and click Create a firewall rule to allow access to the OmniSci API ports. The figure below shows the firewall settings needed to access OmniSci. For further details, read GCP Firewall settings.
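
A firewall rule of this kind could also be created from the command line. This sketch assumes the rule only needs TCP port 6273, the web port used later in this tutorial. The rule name allow-omnisci-web and the wide-open source range 0.0.0.0/0 are hypothetical values; a narrower source range is safer in practice. As before, the command is only printed.

```shell
#!/bin/sh
# Sketch: build a gcloud firewall rule for the OmniSci web port.
# RULE_NAME and SOURCE_RANGE are hypothetical defaults; adjust both.
RULE_NAME="${RULE_NAME:-allow-omnisci-web}"
SOURCE_RANGE="${SOURCE_RANGE:-0.0.0.0/0}"

build_firewall_cmd() {
  printf '%s ' gcloud compute firewall-rules create "$RULE_NAME" \
    --direction=INGRESS \
    --allow=tcp:6273 \
    --source-ranges="$SOURCE_RANGE"
  printf '\n'
}

# Dry run: show the command without executing it.
build_firewall_cmd
```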

Installing OmniSci Open Source

After you create the virtual machine, it takes a couple of minutes for the VM to become available, which is indicated by the green icon. You can access the VM using the external IP address associated with its network interface (nic0).

You can SSH in a browser window using the SSH drop down menu.

The document CentOS/RHEL 7 OS GPU installation with Yum has detailed instructions on installing OmniSci using yum. Here is a summary of the installation commands.
sudo yum update
sudo reboot
 
sudo yum install java-1.8.0-openjdk-headless
sudo yum install epel-release
sudo useradd -U -m omnisci
sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
sudo reboot

Install the NVIDIA CUDA drivers.
sudo yum install wget
wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-rhel7-10-1-local-10.1.105-418.39-1.0-1.x86_64.rpm
sudo rpm --install cuda-repo-rhel7-10-1-local-10.1.105-418.39-1.0-1.x86_64.rpm
sudo yum clean expire-cache
sudo yum install cuda-drivers
sudo reboot

Run nvidia-smi to verify the drivers are installed correctly and the GPU is recognized.
nvidia-smi
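
The same check can be scripted so that a missing driver produces a clear message instead of a "command not found" error. This is a minimal sketch; check_gpu is a hypothetical helper name, and the query fields shown are standard nvidia-smi options.

```shell
#!/bin/sh
# Sketch: report the GPU name and driver version, or explain why not.
check_gpu() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: the driver is not installed or not on PATH"
    return 1
  fi
  # One CSV line per GPU, for example the P4 selected for this VM.
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

check_gpu || echo "GPU check failed"
```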


Open port 6273 in the CentOS firewall.
sudo firewall-cmd --zone=public --add-port=6273/tcp --permanent
sudo firewall-cmd --reload

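Modify the yum repository file to include the OmniSci repository specification. Note that yum also needs a baseurl= entry in this stanza to locate the packages; it is not shown in the listing below, so take its value from the OmniSci installation guide.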
sudo vi /etc/yum.repos.d/CentOS-Sources.repo
[omnisci]
name='omnisci os - cuda'
enabled=1
gpgcheck=1
repo_gpgcheck=0
gpgkey=https://releases.omnisci.com/GPG-KEY-mapd
 
 
sudo yum install omnisci
cd ~/
vi .bashrc
# User specific aliases and functions
export OMNISCI_USER=omnisci
export OMNISCI_GROUP=omnisci
export OMNISCI_STORAGE=/var/lib/omnisci
export OMNISCI_PATH=/opt/omnisci
export OMNISCI_LOG=/var/lib/omnisci/data/mapd_log
source .bashrc
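
Before moving on, it can help to confirm that the variables are actually exported in the current shell. A minimal sketch, assuming the five variable names listed above; check_omnisci_env is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: verify that the OmniSci environment variables are exported.
check_omnisci_env() {
  missing=0
  for name in OMNISCI_USER OMNISCI_GROUP OMNISCI_STORAGE OMNISCI_PATH OMNISCI_LOG; do
    # printenv only sees exported variables and fails if the name is unset.
    if value=$(printenv "$name") && [ -n "$value" ]; then
      echo "$name=$value"
    else
      echo "$name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_omnisci_env || echo "Some variables are missing; re-run: source ~/.bashrc"
```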

Run the script to create the database data directory in $OMNISCI_STORAGE. It also creates the subdirectories for catalogs, data, export, and log.
cd $OMNISCI_PATH/systemd
sudo ./install_omnisci_systemd.sh

Start the OmniSci server daemon and enable it at system startup.
cd $OMNISCI_PATH
sudo systemctl start omnisci_server
sudo systemctl enable omnisci_server
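
To confirm that the service came up, its state can be queried with systemctl. A minimal sketch; service_state is a hypothetical helper name, and the unit name is the one used in the commands above.

```shell
#!/bin/sh
# Sketch: print the systemd state of a unit.
service_state() {
  # $1: systemd unit name
  if ! command -v systemctl >/dev/null 2>&1; then
    echo "unknown (systemctl not available)"
    return 2
  fi
  # is-active prints the state (active, inactive, failed, ...) and
  # exits non-zero unless the unit is active.
  systemctl is-active "$1"
}

echo "omnisci_server: $(service_state omnisci_server)"
```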

Load and Query Sample Dataset

Load the sample flights dataset with 7 million records (option #1) into the database.
cd $OMNISCI_PATH
sudo ./insert_sample_data

Connect to OmniSci Core with the SQL command line utility (default password is HyperInteractive) and run a query against the newly created flights_2008_7M table.

$OMNISCI_PATH/bin/omnisql
omnisql> SELECT origin_city AS "Origin", dest_city AS "Destination", AVG(airtime) AS "Average Airtime" FROM flights_2008_7M  
    WHERE distance < 175 GROUP BY origin_city, dest_city;
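
For scripting, a query can be piped to omnisql instead of typed at the prompt. This is a sketch under two assumptions: that omnisql accepts the password through a -p option, as described in the OmniSci documentation, and that the default user and database apply. run_sql is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: run one SQL statement non-interactively through omnisql.
run_sql() {
  # $1: SQL statement, terminated with a semicolon
  omnisql_bin="${OMNISCI_PATH:-/opt/omnisci}/bin/omnisql"
  if [ ! -x "$omnisql_bin" ]; then
    echo "omnisql not found at $omnisql_bin" >&2
    return 1
  fi
  # Assumption: -p supplies the password (default: HyperInteractive).
  printf '%s\n' "$1" | "$omnisql_bin" -p HyperInteractive
}

run_sql 'SELECT COUNT(*) FROM flights_2008_7M;' || echo "query was not run"
```

Passing a password on the command line exposes it in the process list, so this form suits a test VM better than a shared machine.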


OmniSci Charting API

You can create charts for the flights dataset you just loaded using the OmniSci charting API in JavaScript. OmniSci provides mapd-charting, a fast charting library that is based on dc.js and is designed to work with MapD-Connector and MapD-Crossfilter to create charts using OmniSci's Core SQL Database as the backend. In mapd-charting, you will find sample code for creating frontend-rendered charts such as Bar, Line, and Scatterplot.

NOTE: In the OmniSci Open Source version, backend-rendered charts (Pointmap, Linemap, Choropleth, etc.) are not supported.

The OmniSci Javascript API uses the web port 6273 for connecting to the backend database, so start the OmniSci web server and enable it across reboots.

sudo systemctl start omnisci_web_server
sudo systemctl enable omnisci_web_server
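
Once the web server is running and the firewall rules are in place, a quick probe from your own machine shows whether port 6273 is reachable. A minimal sketch; OMNISCI_HOST is a hypothetical variable holding the VM's external IP address, and any HTTP status code in the output (even 404) means the port answered, while 000 means the connection failed.

```shell
#!/bin/sh
# Sketch: probe the OmniSci web port from outside the VM.
# OMNISCI_HOST is a hypothetical variable: the VM's external IP address.
OMNISCI_HOST="${OMNISCI_HOST:-}"

probe_web_server() {
  # $1: host name or IP address. Prints the HTTP status code, or 000
  # if no connection could be made within five seconds.
  curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "http://$1:6273/"
}

if [ -n "$OMNISCI_HOST" ]; then
  probe_web_server "$OMNISCI_HOST" || echo "port 6273 is not reachable"
else
  echo "Set OMNISCI_HOST to the VM's external IP address to run the probe"
fi
```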

Install mapd-charting

Get the source code:
git clone https://github.com/omnisci/mapd-charting.git

Install Dependencies:
cd mapd-charting
yarn install

Test a simple cross-filtered chart example:
cp example/example1.html example/index.html
yarn run start

Open a browser to http://localhost:8080, and you should see three cross-filtered charts that use the sample flights dataset. The init() function in example1.html shows how to connect to the OmniSci backend database; in the example it connects to the host metis.mapd.com.

Testing mapd-charting with your GCP instance

To test with your GCP instance, use the following init() function, replacing the host IP address with your instance's external IP address. Also notice that the database table name is set to flights_2008_7M.
function init() {

   /* Before doing anything we must set up a mapd connection,
    * specifying username, password, host, port, and database name.
    */
   new MapdCon()
     .protocol("http")
     .host("35.236.55.91")         // External IP address of your GCP instance
     .port("6273")                 // OmniSci web server port
     .dbName("mapd")               // Default database
     .user("mapd")
     .password("HyperInteractive")
     .connect(function(error, con) {
       /*
        * This instantiates a new crossfilter.
        * Pass in the connection as the first argument to crossfilter,
        * then the table name, then a label for the data
        * (unused in this example).
        *
        * To list all available tables, call con.getTables().
        *
        * The resulting crossfilter instance is passed to createCharts.
        */
       crossfilter.crossfilter(con, "flights_2008_7M").then(createCharts);
     });
 }

The three charts (bar, bubble, and time line) use the columns dest_state, carrier_name, depdelay, arrdelay, and dep_timestamp from the flights_2008_7M table. You can apply filters by clicking a state on the bar chart, selecting an airline on the bubble chart, or dragging the time brush on the time chart. All the charts then redraw automatically, applying the selected filters through the crossfilter functions.

You can visit the OmniSci community forum to learn more and share your experience.

OmniSci is also available on Google Cloud Platform Marketplace, which offers three instance types based on the NVIDIA K80, P100, and V100 GPUs.










