Tuesday, 22 December 2009

Hot Standby TODO

Well, finally managed to get Hot Standby committed on Friday. The build farm was still Green on Saturday, so my efforts to keep the patch non-build specific seem to have paid off.

Quite a few bugs fixed in the days up to release and I ended up working 19 days straight on getting it into Postgres. Walking to the South Pole does seem like an easier challenge.

Couple of things already reported, so thanks to those vigilant community members. One a doc change and another a bug effecting idle sessions connected to a database that is being dropped. Strangely, we correctly handle the case where they are running SQL, just mess up when they aren't doing anything at all. It's minor, but still a few hours to fix.

There's been various discussions on Hackers about what features are needed next. For the most part, everybody wants their pet peeve of the moment to be a must-fix item. Given that I am actually human, I need to work through the tasks in a priority order and we need to get some balance about what that order should be.

I'll be hosting a few Hot Standby User Groups (HugS) to discuss in more detail the issues surrounding the feature and what people think they want. If you want to be a Hugger, please come along.

Plan is to have some HugS remotely, then do New York and FOSDEM in February. If you'd like to come along for some HugS please register at http://www.2ndQuadrant.com/. The first remote HugS is planned for January 6 at 1600UTC.

I hope you can come along to help adjust my priorities.

Thursday, 8 October 2009

Fried Eggs

When I was 24, I was asked to attend an emergency breakfast meeting to discuss the then current project. The Director had heard that the project was 1000 times slower than it needed to be. The Salesman's hands were shaking as I explained the performance results and their implications, in front of a group of people all more senior than myself. It was heresy, but once said, everything changed for the better, though I didn't know that as I spoke. I learned a lesson that day that I have never forgotten.

We try to work on a consensus basis on the PostgreSQL project. Some people assume that to mean that we must all agree. Others complain that this means everybody has a veto.

The committers are the eventual decision makers on the PostgreSQL project. Good decision making processes require collection of information prior to making a decision. Sometimes we don't have time for that, sometimes we do. "When" is always part of decision making.

Opinions must be openly disagreed with, if there is a clear and important reason and you are qualified to do so. That is the only way to give a decision maker all the information they need to take the right decision. Those decisions may require overruling an objector. Once a decision has been reached, it is best to quieten disagreement, since we don't have endless time and must move onto other matters. It is sometimes difficult to know when a decision maker speaks whether it is a decision, or an opinion. If we all (including, most importantly, a decision maker themselves) assume that every word spoken by a decision maker is a firm decision then that quickly leads to poor quality decisions. Consensus does not imply acquiescence, it requires both speaking and listening, in all directions.

Mahatma Gandhi said "Honest disagreement is often a good sign of progress."

If your disagreement is honest, then speak it clearly, politely and impersonally. Once decisions have been made, that disagreement should be put to one side, at least for this release. Honest disagreement is not a challenge to authority, it shows respect for the goals we all work towards.
Honest agreement is also important. If you agree, do not withhold that for partisan or political reasons.

All easy to say, much harder to do.

(The lesson? Never eat fried eggs while trying to say something important.)

Friday, 18 September 2009

Create Index CONCURRENTLY

I saw Depesz just came up against something on pgsql-bugs that I've done a couple of times.

CREATE INDEX CONCURRENTLY
ON tablename (column list);

Works perfectly. It adds an index called "concurrently" onto "tablename". That part is actually quite funny, and easily correctable using

ALTER INDEX concurrently RENAME TO something_sensible;

It's a very good lesson in why its a great idea to use tab completion, or a GUI. I have another story about that though, for another day.

Anyway, the really painful part about creating our index above is that the DBA (yours truly) intended to create an index concurrently, that is "concurrently" as an adverb rather than noun. The result is that you run a normal CREATE INDEX statement, so you end up applying a ShareLock to the table and all writes against the table stop. Of course, that happens right at the moment where you decide a cappuccino is a great plan and drive out for a round of coffee while the index build runs. Neat huh?

I figure I'm not the only person to forget to make mistakes and if we never admit them the software won't improve. So I'll see if we can come up with a little patch that avoids us adding "concurrently" as a reserved word. Not sure I see the problem really, since I never met anyone who named their database objects using adverbs rather than nouns. I bet there's one. There always is.

Anyway, the moral of the story is use

CREATE INDEX CONCURRENTLY indexname
ON tablename (column list);

Well actually, its not, but this is: Production systems are like lions - never turn your back on them, even after they've been fed, no matter how long you've been training them. And even lion tamers get bitten, sometimes. (And that is why I spend so long developing recovery features).

Thursday, 10 September 2009

Winning the Lottery

This week I'm the lucky recipient of about £23,750,000. That comes from various wills, lottery wins and corrupt magnates who can't cash a cheque any other way, except via me. So they say.

Last week I won more, I'm sure of it.

Did these guys ever need a database so they can deploy advanced marketing techniques!
(phone rings) .... ah, what's that? .... really? .... oh, how embarrassing, 3 you say? ....Maybe it needs tuning then, or perhaps replication? It does? Cool.... what's that? Could I just send a cheque for £100, so you can have my payment details correct and then you'll send through the consulting fees?.... Hmmm, maybe not after all. (hangs up)

Definitely hard to know which communications to receive and which to ignore. I figure if you read everything you'd never get anything done at all. I guess it's a balance and everybody chooses their own personal balance point.

Anyway, forget that, I'm off to spend my winnings: Woo hoo - So long, Lao Che!

Monday, 7 September 2009

Hot Standby

I'm currently working on the Hot Standby project, which kicked off design and development around July 2008 and was active through to February 2009. After a break to assist with the release of Postgres 8.4, the project began again in July 2009 as planned. It's definitely been a voyage of discovery.

Hot Standby seems to be a popular feature from two recent polls, so I guess that is why some people have been getting fairly anxious about the project recently. Anyway, what is it? Why is it important? Let me explain.

If you set up Log Shipping replication with PostgreSQL (called "Warm Standby" in the docs), you'll discover that the standby node can't run queries. All it does is keep processing changes so that it keeps mostly/nearly/eventually up to date with the Master - hence why its called a "warm" standby. Useful, but not enough, hence we move on to Hot Standby.

Why "Hot"? I've used the terms Cold/Warm/Hot similarly to the way they're used in other areas of High Availability computing. A Cold Standby/Spare is one that isn't available for immediate use and requires some work to get things configured. A Warm Standby/Spare is one that can be put into action fairly quickly, while a Hot Standby/Spare is one that can be used immediately without taking the system offline.

With Postgres, Hot Standby refers to the ability to

* Run read-only queries on a standby server that is being concurrently maintained by log shipping replication. We use full MVCC so read-only queries don't block incoming changes, nor do the changes lock out queries. There are some conflicts that can occur, discussed later.

* Switch from standby mode to normal/master-mode without stopping/restarting the server. So queries keep on running without interruption while the database server changes mode - hence it is "Hot". Many people are unaware of this second aspect of the feature.

The reasons for query conflicts are fairly detailed for a short blog, so I'll cover that in a later post. There'll be detailed docs with the patch when it's available around Sept 15.

The strange thing about Hot Standby is "it just works". When you run a few queries there's an initial buzz, but then you realise that it all works normally without much fuss. That's a shame really because this is probably the most technical patch to hit Postgres in years, involving many different parts of the internals. Seems fair though, after all, I admit I give scarcely no thought to all the people who spent time on the hardware and software that allows me to write this and for you to read it.

I'll be in Tokyo for the JPUG 10th Anniversary meeting later this year to present about Hot Standby. I'm very pleased to be attending and it would be a good place to learn more.

There's a practical course in London in late September if you want to work together while we cover the full details on Hot Standby and replication. Probably be more training dates and other events over the coming months also.

Tuesday, 1 September 2009

Database Biology

From Biology we learn that all (known) complex organisms are comprised of multiple cells. Cells are not simple things, but their most important quality is their ability to work together to create something much bigger than themselves.

Large, mature businesses end up with hundreds of databases to perform all of their business functions. Yet we seem still to have much difficulty in making databases work together. ETL, EAI and information integration tools seem to help somewhat but that feels more like plumbing than intelligent integration.

My goal for some years now has been to create ways for databases to work together in more complex ways, not just operate as single-celled organisms. The backbone of that is change propagation between the cells, which most people call database replication. That is why I've been so engaged in enhancing replication for so many years but it is not an end in itself. Sure, most of that has been for PostgreSQL - but then that's OK because everybody uses that, whether they even know/admit it or not.

Making it work is the first step. Making it all work easily is important also. With my background in solutions, I've personally preferred a toolkit approach that allows good custom solutions to be developed. Shrink-wrap software is needed as well and that's why I've been working with Robert Hodges of Continuent to help bring a compelling solution forwards that has both a good out-of-the-box feel as well as the flexibility to solve those tricky requirements people keep throwing at me. (Don't stop! It's fun)

Streaming, synchronous replication will be with us by next Summer if the release dates all go to plan. The ability to run queries on standby servers while they actively accept new changes from their master will also be with us soon: Hot Standby.

I'll blog more about Hot Standby for PostgreSQL over the coming weeks. In the longer term I'll explain more about the directions I hope to take with clustered databases, and why.