January 07, 2005

Connection Pools - are they always a good thing?

Sorry about the hiatus, it was mainly because I had nothing that I wanted to say... I suspect that that is going to change.

I had a talk with a friend (hi Meeraj!) yesterday and we were discussing a connection pool that he was using. After certain queries anything up to 100 megabytes were consumed and not garbage collected. The theory that we arrived at was that the PreparedStatement caching was the problem. I don't know if that was the case, I hope that Meeraj will let me know, but that set me to thinking.

Why did we introduce the complexity of a connection pool to our code in the first place?

I don't just mean the complexity of writing them (that's mainly dealt with by DB vendor's own implementations nowadays); tuning them and ensuring the robustness of our own code around them can be an headache.

As I recall the major reason for pooling connections was the relative expense of creating the connection in the first place. This expense was only large compared to the time that a small query would take to run and be used. It made a lot of sense to reuse a single connection for several of these little queries. The longer running the queries the smaller the amount of time (proportionately) was spent in creating the connection.

So, for a system in which queries are long running and the connections tend to consume a lot of resources, does it make sense to have a connection pool? Would it be better to have some more simple form of connection management?

The 'meta' lesson is that even for the most frequently used patterns / components, we should always remember why they were used in the first place and recognise the situations where they add no value or perhaps even cause problems.

No comments: