This post is set up as a wiki, so that OmniSci employees and trusted community members can add information over time. If you have a suggested topic, please either ask the question below, or if you have sufficient permissions, add the content. Thanks!
What is OmniSci?
The OmniSci platform is designed to overcome the scalability and performance limitations of legacy analytics tools when they are faced with the scale, velocity and location attributes of today’s big datasets. At that scale, those tools become too slow and too hardware-intensive to be effective for big data analytics. OmniSci is a technology originating from MIT, designed to leverage the massively parallel processing of GPUs alongside traditional CPU compute to deliver high performance at scale.
How do I get started using OmniSci?
At OmniSci, our mission is making analytics instant, powerful, and effortless for everyone. We provide numerous ways to install and get started with OmniSci:
- Tarballs for installation on Linux
- Docker containers
- Pre-built images on AWS, Azure and Google Cloud Platform
- OmniSci Cloud for those who want a managed SaaS experience
Regardless of how you install OmniSci, we want you to have an accelerated analytics experience that no one else can provide.
What Cloud Providers does OmniSci support?
OmniSci provides pre-built images for AWS, Azure and Google Cloud Platform, but OmniSci is flexible enough to be installed in nearly any cloud or on-premises data center. For help deciding which version of OmniSci is most appropriate for your use case, please see our OmniSci Downloads version chart, which outlines the features of our various offerings.
What GPUs/hardware can I use?
Given the rapid innovation in data center hardware and cloud offerings, it’s often difficult to know exactly what hardware you should use with OmniSci to solve your business problems. OmniSci provides a Hardware Reference Guide to help customers with their hardware selection.
Of course, we’re happy to answer any questions you may have as part of this Installation and Getting Started forum, and we encourage the community to share hardware configurations they’ve had success with!
OmniSciDB optimizes both the memory and compute layers for performance. It was designed to keep hot data in GPU memory for the fastest possible access. Other GPU database systems store the data in CPU memory and move it to the GPU only at query time, giving up much of what they gain from GPU parallelism to transfer overhead over the PCIe bus.
Is OmniSci an in-memory database? What happens if my data doesn’t fit into RAM?
OmniSciDB is a columnar, in-memory database. The columnar aspect means that OmniSciDB keeps in memory only the columns it needs to answer a query, not entire rows or tables. However, being in-memory does not mean that OmniSciDB is limited by the amount of CPU or GPU RAM in your machine.
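To make the columnar point concrete, here is a minimal Python sketch. It is a toy illustration, not OmniSciDB's actual implementation: the `ColumnStore` class, its `sum_where` method, and the sample column names are all hypothetical. It only shows that a query referencing two columns leaves the other columns unloaded.

```python
class ColumnStore:
    """Toy columnar table (hypothetical): a column is loaded only when a query touches it."""

    def __init__(self, columns_on_disk):
        # columns_on_disk: dict of column name -> list of values,
        # standing in for per-column files on disk.
        self._disk = columns_on_disk
        self.in_memory = {}  # columns currently loaded into RAM

    def _load(self, name):
        # Load the column on first use; later queries reuse the loaded copy.
        if name not in self.in_memory:
            self.in_memory[name] = self._disk[name]
        return self.in_memory[name]

    def sum_where(self, value_col, filter_col, predicate):
        # Roughly: SELECT SUM(value_col) WHERE predicate(filter_col)
        values = self._load(value_col)
        filters = self._load(filter_col)
        return sum(v for v, f in zip(values, filters) if predicate(f))


table = ColumnStore({
    "fare": [10.0, 25.5, 7.25],
    "distance": [1.2, 8.4, 0.9],
    "vendor": ["A", "B", "A"],
    "comment": ["x", "y", "z"],
})

total = table.sum_where("fare", "vendor", lambda v: v == "A")
loaded = sorted(table.in_memory)  # only the columns the query needed
```

After the query, `loaded` contains just `fare` and `vendor`; `distance` and `comment` were never brought into memory.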
OmniSciDB uses a caching strategy that keeps data “hot” as far as possible by holding the most recently touched data in High Bandwidth Memory on the GPU. GPU RAM offers up to 10x the bandwidth of CPU RAM and far lower latency. OmniSciDB is also designed to exploit efficient inter-GPU communication infrastructure such as NVIDIA NVLink when available. When data does not fit completely in GPU RAM, OmniSciDB swaps data between CPU RAM and GPU RAM as needed, and when both CPU and GPU RAM are exhausted, it reads data back from disk as necessary.
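The GPU RAM, CPU RAM, disk hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TieredCache` class is hypothetical, capacity is counted in whole chunks, and eviction is plain least-recently-used, which may differ from what OmniSciDB actually does internally.

```python
from collections import OrderedDict


class TieredCache:
    """Toy GPU -> CPU -> disk hierarchy with LRU eviction (assumed policy)."""

    def __init__(self, gpu_slots, cpu_slots, disk):
        self.gpu = OrderedDict()  # least recently touched first
        self.cpu = OrderedDict()
        self.gpu_slots = gpu_slots
        self.cpu_slots = cpu_slots
        self.disk = disk  # dict of chunk id -> data; always holds everything

    def touch(self, key):
        """Return (data, tier where it was found) and keep the chunk hot on the GPU."""
        if key in self.gpu:
            self.gpu.move_to_end(key)  # mark as most recently touched
            return self.gpu[key], "gpu"
        if key in self.cpu:
            data, source = self.cpu.pop(key), "cpu"
        else:
            data, source = self.disk[key], "disk"
        self._put_gpu(key, data)
        return data, source

    def _put_gpu(self, key, data):
        self.gpu[key] = data
        if len(self.gpu) > self.gpu_slots:
            # GPU tier is full: demote the coldest chunk to CPU RAM.
            old_key, old_data = self.gpu.popitem(last=False)
            self.cpu[old_key] = old_data
            if len(self.cpu) > self.cpu_slots:
                # CPU tier is full too: drop the coldest chunk (it is still on disk).
                self.cpu.popitem(last=False)


cache = TieredCache(gpu_slots=2, cpu_slots=1, disk={"a": 1, "b": 2, "c": 3, "d": 4})
tiers = [cache.touch(k)[1] for k in ["a", "b", "a", "c", "b", "d"]]
```

In this run the first touches come from disk, the repeated touch of `a` is served from the GPU tier, and `b` is later found in CPU RAM after being demoted. A real system would track sizes in bytes rather than slots, but the fallback order is the same as in the paragraph above.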