I was watching the MapD YouTube channel, and they said they store data like a columnar database and put only some columns in the GPU's VRAM, not all of them. So my question is: how do you decide which columns are hot data and should be in VRAM?
Which columns are loaded into VRAM depends on where each one is referenced in the query:
- columns not referenced at all do not need to be loaded
- columns referenced only in a simple projection can have their values merged into the result in CPU memory
- columns used in expressions, aggregations, filters, grouping, or ordering are brought into the GPU for fast execution
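As a rough illustration of the three cases above, here is a minimal Python sketch. It is not MapD code: the `Query` class, the `place_columns` function, and the example table and column names are all made up for this illustration, and the real planner works on a parsed query rather than on hand-written sets.

```python
from dataclasses import dataclass, field

@dataclass
class Query:
    # Hypothetical summary of a parsed query (not a MapD structure).
    projected: set = field(default_factory=set)  # columns only returned as-is
    computed: set = field(default_factory=set)   # columns used in expressions, aggregates,
                                                 # filters, GROUP BY, or ORDER BY

def place_columns(table_columns, query):
    """Map each column to 'gpu', 'cpu', or 'skip' following the three rules above."""
    placement = {}
    for col in table_columns:
        if col in query.computed:
            placement[col] = "gpu"    # needed for computation, so load into VRAM
        elif col in query.projected:
            placement[col] = "cpu"    # simple projection, merged into the result on the CPU
        else:
            placement[col] = "skip"   # never referenced, never loaded
    return placement

# Roughly: SELECT comment, price * qty FROM orders WHERE region = 3
columns = ["comment", "price", "qty", "region", "ship_date"]
query = Query(projected={"comment"}, computed={"price", "qty", "region"})
placement = place_columns(columns, query)
print(placement)
```

Note that a column that is both projected and used in a filter or expression counts as computed here, since it has to be on the GPU anyway.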
One additional complexity is that each column chunk has metadata about its contents (e.g. min/max value), which can be used to avoid loading the chunk at all when a filter cannot match anything in it.
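To show how min/max metadata can rule out a chunk, here is a small Python sketch. The `ChunkMeta` class and `chunks_to_load` function are hypothetical names for this example only, and it assumes a simple range filter on an integer column; the actual metadata layout in MapD may differ.

```python
from dataclasses import dataclass

@dataclass
class ChunkMeta:
    # Hypothetical per-chunk metadata: only the min and max of the stored values.
    chunk_id: int
    min_value: int
    max_value: int

def chunks_to_load(chunks, lo, hi):
    """Return ids of chunks whose [min, max] range overlaps the filter range [lo, hi].

    A chunk whose range lies entirely outside [lo, hi] cannot contain a
    matching row, so it can be skipped without reading its data.
    """
    return [c.chunk_id for c in chunks if c.max_value >= lo and c.min_value <= hi]

chunks = [ChunkMeta(0, 1, 100), ChunkMeta(1, 101, 200), ChunkMeta(2, 201, 300)]

# Roughly: WHERE value BETWEEN 150 AND 180
needed = chunks_to_load(chunks, 150, 180)
print(needed)
```

In this example only chunk 1 overlaps the filter range, so the other two chunks would never need to be copied to the GPU.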
What is kept on the GPU, and when it is released, is all handled automatically.