Monday, 11 January 2010

Hybrid Row/Column Datastores

One of the things I wanted to talk about in more detail at PGday Europe in November was hybrid row/column stores for PostgreSQL. I've just read some more stuff about column stores and that's made me remember this again.

I'd read lots of hot air about how databases organised as column stores are much better than row stores and quite frankly I was getting a little tired of it. Sometime in 2007, I came up with a great solution but it really is taking quite a while to get going with this. Let's start with the background.

The column store argument is that if you separate out each column of a table into its own slice of data, then each slice can be stored as bitmaps, which achieve very good compression ratios. To answer a query you just combine the individual columns required by that query and you've finished. So query cost is O(n) on the number of (compressed) columns referenced in the query, rather than O(m) on the number of columns in the table, as it is for a row store. Column stores are great at doing things like count(*) but not very fast at retrieving whole rows. So there are lots of queries that show how good column stores can be, plus lots of examples of really bad queries as well. It all really depends upon what your exact query mix will be.
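To make that concrete, here's a minimal sketch in Python (purely illustrative - the data and function names are mine, and real column stores use far more sophisticated encodings) of a column held as one bitmap per distinct value, answering a count(*) from the bitmaps alone:

```python
# Illustrative only: one bitmap per distinct value, with a Python int
# standing in for a compressed bitmap. Bit i is set when row i holds
# that value.

def build_bitmaps(column):
    """Map each distinct value in the column to a bitmap of row positions."""
    bitmaps = {}
    for row_pos, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row_pos)
    return bitmaps

city = ["Oslo", "Paris", "Oslo", "Oslo", "Paris", "Rome"]
bitmaps = build_bitmaps(city)

# count(*) WHERE city = 'Oslo' is just a population count on one bitmap;
# no row is ever assembled.
assert bin(bitmaps["Oslo"]).count("1") == 3

# Predicates on several columns combine by AND-ing the relevant bitmaps.
```

A real store would then compress each bitmap, which is where the good compression ratios come from.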

My main issue with column stores is: how will you know ahead of time what your data warehouse query mix will be? If you are forced to pick one database product or another before you've even run a query, that seems a very risky decision. Would you base your data warehousing strategy on that? What we really want is a hybrid row/column datastore where you can have some data in row form and some data in column form.

Nobody seems to have noticed that column store data is exactly the same format as a bitmap index on a single column. So, if we had a bitmap index on every column in a table we would be able to use exactly the same algorithm to extract data as column store databases do. The only implementation barriers are that we don't yet have index-only scans, nor do we have the ability to have columns that are stored *only* in the index. This latter capability would mean that the heap row might hold only the visibility information, with all other data held in indexes. I admit it sounds crazy, but it is exactly what column stores already do to extract data from the individual columns; we can do that too - it would just be an additional step in the executor to retrieve the columns and assemble a result row.
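As a toy model of that extra executor step (Python, illustrative only - none of this reflects actual PostgreSQL code), suppose the heap row keeps only visibility and every attribute lives in a per-column structure:

```python
# Toy model: the "heap" keeps only visibility, and every attribute lives
# in a per-column structure (standing in for a bitmap index on that column).

columns = {
    "id":   [1, 2, 3],
    "name": ["ant", "bee", "cat"],
    "legs": [6, 6, 4],
}
visible = [True, True, False]  # the heap row holds only visibility info

def fetch_row(row_pos, wanted):
    """Assemble a result row by probing each needed column store."""
    if not visible[row_pos]:
        return None  # dead row: never touch the column data at all
    return tuple(columns[col][row_pos] for col in wanted)

assert fetch_row(1, ["id", "name"]) == (2, "bee")
assert fetch_row(2, ["id"]) is None  # invisible row filtered out
```

Columns not named in the query are simply never read, which is the whole attraction.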

Now the beauty of this approach is that you don't have to make choices between using a row and a column store database. You can choose, column-by-column if you wish, which columns will be "column-oriented" and which columns are "row-oriented", or you can have both.

So the things we need to make row/column hybrids work in PostgreSQL are:

* Bitmap indexes
* Index-only scans
* Index-only columns

So core Postgres is at least one release away from that.

There are a couple of other thoughts alongside that as well. First is "join removal", now in 8.5, which is an optimizer technique for removing tables that we don't actually need to answer queries. I was very interested in that because it allows us to speed up queries by providing a mechanism for vertically partitioned tables - splitting a table into many tables, each with a subset of columns. If you slice them up correctly then you avoid significant I/O when you only want a few of the columns. It was the first step in the right direction and I pursued it initially because it required only optimizer changes to make it work. I'll explain more about that in another post.
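Roughly, the effect I'm after looks like this (a Python toy with made-up names; the real mechanism is the optimizer proving that a join back to the unused partition is redundant):

```python
# Sketch of vertical partitioning: one logical table split into two
# physical tables sharing a key. A query that needs only the "hot"
# columns never has to read (or join back) the "cold" partition -
# which is what join removal lets the optimizer prove automatically.

hot = {1: ("ant", 6), 2: ("bee", 6)}   # key -> (name, legs): small, often read
cold = {1: ("a long description",),     # key -> rarely-read wide columns
        2: ("another description",)}

def query_names():
    # Only the hot partition is scanned; the cold one is "removed"
    # from the plan because none of its columns are needed.
    return [name for (name, _legs) in hot.values()]

assert query_names() == ["ant", "bee"]
```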

Last thought is about Greenplum and of course the new Single Node Edition. The latest version now has column-oriented storage as well as row-oriented storage, so you have the option of choosing the design that will work best for your needs. It's a step towards where I'd like to see us go, plus GP is at least a year ahead of core Postgres in getting some form of column orientation. Initial indications are that it is faster too, though I've yet to run any objective performance tests. Been busy.


  1. Hi Simon,

    Just for info, maybe you don't know the following links.

    There was a discussion about performance on the performance-ML, see here:

    He compared PG with a performance test running on MySQL, see here:

    AFAIK, both MySQL engines (Infobright and MonetDB) are column-oriented, and because of this, MySQL is *much* faster than PG.

    I hope I can help you a little bit with this information.

    Regards, Andreas

    PS: see you in Brussels, FOSDEM ;-)

  2. Thanks Andreas for those comments. I was already aware of the information posted on the performance list. There are two conclusions here that I disagree with.

    The explicit conclusion "(Infobright and MonetDB) are column-oriented and because of this, MySQL is *much* faster than PG...". It might appear obvious that column-orientation is the success factor here; I'm not completely convinced, since things such as caching and compression seem more important. AFAICS the success factor seems more likely to be that the people who set the benchmark chose the queries and, funnily enough, they picked ones they do well on. (There is also the logical problem of equating Infobright/MonetDB with MySQL itself.)

    The implicit conclusion from all of this is that because somebody showed us some fast query results we would make a technology selection decision towards a data warehouse database that restricts us to choose only column-orientation. My vision is that people need hybrid approaches and ultimately we can include these concepts within Postgres.

    The bottom line here is that we already know that queries can be made faster by pre-processing the data into a form that speeds up those queries. What we all need are options that give us the flexibility to store the data in the physical form that suits the queries most. Postgres also has access to a mechanism to do that, which is to create summary tables when we load data. Which leads me towards Materialized Views...
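To illustrate the summary-table idea (a Python toy with made-up data; a real summary table or materialized view would of course live in the database and be maintained at load time):

```python
# Sketch of a "summary table": aggregate once while loading the data,
# then answer the repeated query from the pre-computed result instead
# of scanning the detail rows every time.

sales = [("2009-11", 10), ("2009-11", 5), ("2009-12", 7)]

# Built during data load, like refreshing a materialized view.
summary = {}
for month, amount in sales:
    summary[month] = summary.get(month, 0) + amount

# Query time: a single lookup, no scan of the detail rows.
assert summary["2009-11"] == 15
```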

  3. As you said, compression is very important in these results, mainly because the column store structure greatly improves the compression ratio on a data warehouse: the columns typically contain a lot of duplicate values, so even RLE compression gives a good result.
    Because of this, any columnar implementation in PG must use compression.
    Best regards