Oracle’s Parallel Query Option (PQO) is a fantastic tool. But much like any good tool, it can very easily be used in the wrong situation or simply abused. In those cases, PQO can actually make database performance much worse. Let’s examine some common misuses or misperceptions regarding optimal and efficient PQO usage.
Not really. Why use PQO on a small dual-core server just because it’s available? And why use PQO on a big SMP server with lots of CPUs but perhaps a less robust IO subsystem – what advantage does that offer? And my favorite abuse: we have a 64-CPU server, so we’ll just define all important tables and indexes as parallel 64 (or some lesser number still significantly greater than one). How many of those people’s servers will be running many concurrent queries anyhow? So why subdivide the work and force processes to share CPUs? We don’t drive the posted speed limit simply because we can if the weather makes that a questionable choice. We should use the same discretion when planning PQO usage.
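A quick way to spot this abuse is to check the dictionary for objects whose default degree has been cranked up. This is a minimal sketch – the table name is made up, and the degrees shown are illustrations, not recommendations:

```sql
-- Find tables whose default degree is anything other than serial.
-- (DEGREE in the dictionary views is a blank-padded string, hence TRIM.)
SELECT table_name, TRIM(degree) AS degree
FROM   user_tables
WHERE  TRIM(degree) NOT IN ('1', '0');

-- Dial an over-parallelized table back down (big_fact_table is hypothetical):
ALTER TABLE big_fact_table NOPARALLEL;
-- or, if some parallelism is genuinely justified, something modest:
ALTER TABLE big_fact_table PARALLEL (DEGREE 4);
```

The same check applies to indexes via `user_indexes`.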
Right – and I have some wonderful “magic beans” to sell to people who blindly believe that theory. Oracle has to spawn additional parallel slave processes (or reuse those still lying around from recently completed PQO operations). So the cost of spawning additional processes and their footprints must be accounted for in the grand scheme of things. Plus, Oracle needs a “coordinator” process to coalesce all the results – and that too has a cost. So calculate all this into your assumptions before using PQO. For those of us less scientific, think of PQO as being like a grocery store checkout line. If you only have a couple of items in your cart, will forcibly spreading them across multiple checkout lanes get you out the door any sooner? And even if you do have an overly full cart, will spreading your items across checkout lanes that are already busy be any better? There are obvious cases where one choice beats the other. We simply need to apply the same clear thinking when using PQO.
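You can actually watch the coordinator-plus-slaves arrangement while a parallel query runs. A hedged sketch – run this from a second session while the parallel query is active:

```sql
-- Each row is a parallel slave; QCSID identifies the query coordinator
-- session that owns it. DEGREE vs. REQ_DEGREE shows whether the query
-- actually got the degree it asked for.
SELECT qcsid, sid, degree, req_degree
FROM   v$px_session
ORDER  BY qcsid, sid;
```

If `DEGREE` comes back lower than `REQ_DEGREE`, the query was downgraded – evidence that the “N times faster” assumption is already off.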
OK – so now I also have the clear rights to sell the “Brooklyn Bridge” to people who may think this. Oracle and parallel queries are much more complicated than that. For example, an ORDER BY or GROUP BY doubles the number of non-coordinator processes required – so we now have at least two times N. Then there are “init.ora” parameters that limit or govern PQO usage. So some queries might get the full requested degree, some might execute serially, while others may use fewer than N processes due to those configuration parameters. The only way to know for sure is to monitor it live.
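Monitoring this live is straightforward. A minimal sketch (parameter names are the classic ones from this era – verify against your version):

```sql
-- The governing knobs: parallel_min_servers, parallel_max_servers,
-- parallel_adaptive_multi_user, and friends.
SELECT name, value
FROM   v$parameter
WHERE  name LIKE 'parallel%';

-- Slave pool activity since instance startup:
SELECT statistic, value
FROM   v$pq_sysstat;

-- Were operations downgraded below their requested degree, or serialized?
SELECT name, value
FROM   v$sysstat
WHERE  name LIKE 'Parallel operations%';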
OK – I’m 100% guilty of doing this on occasion. In fact, I might even do this more often than not. But I am cognizant of the above issues and have planned accordingly – and still have a reasonable PQO scenario where IO is definitely not my chief bottleneck. In those clear cases I will extend beyond the actual CPU count in my PQO settings. I’ve even gone two to four times higher in the right circumstances, and in those cases it has made a clear and measurable difference. But be careful of when it does not make sense. For an easy test to verify this, simply set PQO to degree four on your dual-core desktop or laptop and run several concurrent queries. They will not only run slower, but painfully so.
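That dual-core test might look like this – a hypothetical sketch (the table name is made up), run from several sessions at once and timed with `SET TIMING ON`:

```sql
-- Force a degree well above the machine's two cores:
ALTER TABLE sales_test PARALLEL (DEGREE 4);

-- Run a few of these concurrently from separate sessions and compare
-- elapsed times against the serial case:
SELECT COUNT(*) FROM sales_test;

-- Then put the table back and repeat the timings:
ALTER TABLE sales_test NOPARALLEL;
```

With four slaves per query competing for two cores across several sessions, the context-switching overhead alone usually makes the parallel runs visibly worse.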
As silly as this may sound at first, I’ve seen this in many data warehousing scenarios. I think it happens because somewhere someone wrote that “parallel full-table scans are efficient.” But that author meant in clearly reasonable scenarios – not universally. Yet this remains one of the top data warehousing tuning mistakes I see. Think about a DW query – it may scan millions of rows in tables with many billions of rows. Why force a full-table scan when there is at least a ten-to-one, if not a one-hundred-to-one, reduction factor? That’s exactly what indexes are for – and index scans can be done in parallel. I’ve seen improvements as large as 1000% from undoing this simple PQO misuse.
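For instance, you can steer a query toward a parallel index scan with hints. This is a hypothetical sketch – the table, index, and column names are invented, and `PARALLEL_INDEX` assumes a partitioned index on your version of Oracle:

```sql
-- Prefer a parallel index range scan over a parallel full-table scan
-- when the predicate is selective:
SELECT /*+ INDEX(f sale_date_idx) PARALLEL_INDEX(f, sale_date_idx, 4) */
       SUM(f.amount)
FROM   sales_fact f
WHERE  f.sale_date BETWEEN DATE '2009-01-01' AND DATE '2009-01-31';
```

Check the execution plan afterward to confirm the optimizer actually took the index path – hints are requests, not commands.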
Informative post written in a lighthearted way. IMHO, Auto DOP in 11gR2 is a controversial change to the optimizer’s behavior. One reason is that a parameter setting puts a ceiling on the number of parallel sessions, and that limit can be hit silently. How are DBAs using parallel processing?