Tuning Redo Log Sync Wait Events

So if you ask very specific and obscure technical questions and you’re not getting answers or you’re getting “don’t know” answers then you might want to figure out that the interviewee is not a walking Oracle manual – if you do come across someone who has memorized all the thousands and thousands of pages of Oracle manuals, then you might want to wonder if that person can remember their own name.

So now I have some more Oracle interview questions that I could not answer, after almost 30 years of experience I must confess I don’t know everything. I am interviewing someone who is clearly experienced, but I am not getting answers I might think that I am asking the wrong questions – it’s not unfair but moreover it means I am either not going to hire someone I probably should, or I am trying to obstruct the possible hire of a potential competitor to myself.

What causes high CPU and how can t be resolved? What is this a bad question to ask? Because it has about 101 different answers. The most likely answer is because of i/o problems (again a myriad of causes), but I recently was asked this and it turned out it was because of log writer sync wait events. The trouble with those is that they are normal, and the only things you can do are build out the redo log files properly (should be done on installation anyway as Oracle defaults to 3 sorely inadequately sized redo logs). So if have not moved away from the default installation model fix that first. If you are tempted to changed the redo log buffer – don’t – that’s just tampering. Most DBAs with experience will not touch the log buffer. So you’ve not fiddled with the log buffer and you’ve set up the redo logs properly – so what else? Is memory for other parts of the database configured correctly because if it’s not (Oracle internal parameters and kernel parameters) or something is wrong in the o/s that causes memory problems, then perhaps there will be swapping or constant process switching issues related to the log writer buffer and process, but the point is it’s something else causing the problem – fiddling with the log buffer will not solve your problem because the problem is elsewhere. What does cause log buffer waits from within oracle? Your application has too many commits because the log buffer will write to disk with every commit operation, even if the log buffer has a single block in it (log buffer auto writes to disk when 1/3 full as well). Again, your log files can be moved to faster disks but once again look at redo log file architecture first – the easiest way.  Then go talk to your developers about too many commits in coding.

So now go and look at Tuning log file syn wait events if you still don’t believe me and if you read the lists of causes (the things you should focus on to actually solve the problem, you will see: disk i/o, busy server, memory issues and paging, buffer cache (memory issues inside Oracle or o/s kernel parameters or both), too many coded COMMITs, writing to archives or writing out log buffer to disk (architect redo logs and archive logs properly), latches and locks. In short it’s all NOT the log buffer. Fix the problem not the symptom.

So if you’re spending hours poring over operating system utilities and logs, then you’re probably wasting your time. If you changed the log writer buffer size (from its default), then seriously considering changing it back, because if you have not solved the problem it’s because you addressed the symptom and not the problem. Also bear in mind that not all problems are soluble and sometimes they are not even a problem. And if someone says “we have latch problems” then they either have configured memory incorrectly (kernel and Oracle internals, and have they moved from HP-UX or Windows to Linux).

Use Google for there is always another who has had the same problem! And remember that not all problems indicate an issue that is directly related to that issue, and that Oracle wait events always indicate a symptom and not an actual cause of a problem.