Thursday, 28 January 2010

9.0 Replication Features

We're getting close to deadline now for the next release of PostgreSQL.

Tensions running high as usual.

My concerns at this stage are always that we get some rounded features in before the doors close. It's always a shame to ship software exactly on time if that misses out on a few essential extras. Mind you without a hard deadline things do tend to slip for weeks and months on any project, not just PostgreSQL.

We're all working hard on the various replication features while we still can.

Greg Smith's been doing a great job improving pg_standby for use in file-based WAL shipping. I'm working on conflict resolution and other essential internals for Hot Standby. Hannu Krosing has been looking at improvements in Londiste replication, though that's not strictly tied to the release of PostgreSQL core.

We'll get there.

Monday, 18 January 2010

Mating Elephants

I've just discovered that we'll not be including Synchronous Replication in the next version of PostgreSQL, which is a surprise after so much work. (We will have streaming replication, however).

Remembering that the PostgreSQL symbol is an elephant (apart from in Japan), there is an old office joke that seems strangely appropriate. It goes like this:

Getting anything done around here is like mating elephants
* Everything gets decided at a high level
* There's a lot of stomping and trumpeting
* And it takes two years to get any results

Anyway, I'm sure we'll get there in the end so we can get multiple elephants synchronised.

Monday, 11 January 2010

Hybrid Row/Column Datastores

One of the things I wanted to talk about in more detail at PGday Europe in November was hybrid row/column stores for PostgreSQL. I've just read some more stuff about column stores and that's made me remember this again.

I'd read lots of hot air about how databases organised as column stores are much better than row stores and quite frankly I was getting a little tired of it. Sometime in 2007, I came up with a great solution but it really is taking quite a while to get going with this. Let's start with the background.

The column store argument is that if you separate out each column in a table into its own slice of data then this can be stored as bitmaps. These achieve very good compression ratios for each column. To answer a query you just link together all the individual columns required for the query and you've finished. So query cost is O(n) on the number of (compressed) columns in the query, rather than a row store for which data access is O(m) on the number of columns in the table. Column stores are great at doing things like count(*) but not very fast at retrieving data. So there are lots of queries that would show how good column stores can be, plus lots of examples of really bad queries as well. It all really depends upon what your exact query mix will be.

My main issue with column stores is how will you know ahead of time what your data warehouse query mix will be? If you are forced to pick one database product or another before you've even run a query then it seems a very risky decision. Would you base your data warehousing strategy on that? What we really want is a hybrid row/column datastore where you can have some data in row form and some data in column form.

Nobody seemed to have noticed that the column store data is exactly the same format as bitmap indexes on a single column. So, if we had a bitmap index on every column in a table we would be able to use exactly the same algorithm to extract data as column store databases do. The only implementation barriers would be that we don't yet have index-only scans, nor do we have the ability to have columns that are stored *only* in the index. This latter capability would mean that the heap row might actually hold only the visibility information and all other data would be held in indexes. I admit it sounds crazy, but it is exactly what column stores already do to extract data from the individual columns; we can do that also, it would be just an additional step in the executor to retrieve columns and assemble a result row.

Now the beauty of this approach is that you don't have to make choices between using a row and a column store database. You can choose, column-by-column if you wish, which columns will be "column-oriented" and which columns are "row-oriented", or you can have both.

So the things we need to make row/column hybrids work in PostgreSQL are

* Bitmap indexes
* Index-only scans
* Index-only columns

So core Postgres is at least one release away from that.

There are a couple of other thoughts alongside that as well. First is "join removal", now in 8.5, which is an optimizer technique for removing tables that we don't actually need to answer queries. I was very interested in that because it allows us to speed up queries by providing a mechanism for vertically partitioned tables - splitting a table into many tables each with a subset of columns. If you slice them up correctly then you avoid significant I/O when accessing tables when you only want a few of the columns. It was the first step in the right direction and I pursued it initially because it required only optimizer changes to make it work. I'll explain more about that on another post.

Last thought is about Greenplum and of course the new Single Node Edition. The latest version now has column-oriented storage as well as row oriented storage. So you have the option of choosing the design that will work best for your needs. It's a step towards where I'd like to see us go, plus GP is at least a year ahead of core Postgres in getting some form of column orientation. Initial indications are that it is faster also, though I've yet to do any objective performance tests. Been busy.

Tuesday, 5 January 2010

Extra Cost Option

Oracle's version of the Hot Standby concept is called Active Data Guard. You may know that Oracle11g was the first release of Oracle to have this functionality, so as of PostgreSQL 8.5, we will be on par. Probably worth adding that IBM DB2 doesn't have anything similar.

Someone just pointed out to me that Active Data Guard is an extra cost option with Oracle. Please point this out to your IT Manager!

Oracle suggests the following benefits for Active Data Guard
  • Increase performance—Offload unpredictable workloads to an up-to-date replica of the production database
  • Simplify operations—Eliminate management complexity that accompanies traditional replication solutions
  • Eliminate compromise—The reporting replica is up to date and online at all times, which is not possible with traditional storage mirroring technology
  • Reduce cost—An Oracle Active Data Guard physical standby database can also provide disaster recovery and/or serve as a test database—no additional storage or servers required
I think the first one definitely applies for PostgreSQL. The second and third depend upon what you're comparing against. I don't really understand the last claim: yes, the standby provides disaster recovery, but then so did Inactive(?) Data Guard/Warm Standby. And this idea about test databases seems a little strange, since the standby is read-only its not going to be a very useful test database.