淘先锋技术网

首页 1 2 3 4 5 6 7
现象描述:
在DB1数据库的lgwr日志里发现lgwr写入小量数据(1kb或几十KB)却延时严重(500ms以上)
问题:
为什么redo log写入小量数却延时严重(500ms以上)?
结论:
初步判断使用veritas的库文件(含ODM)版本过低造成,会令lgwr写日志时挂起3秒,oracle技术文档建议升级到5.1。
行动:
1. 根据文档确认我们使用的veritas 库文件版本及是否启用? How To Validate/Verify ODM (Oracle Disk Manager) Is Enabled. (文档 ID 1352242.1)
2. 在support oracle开设SR追问关于此问题的诊断及解决方案。
影响:
在DB1上会出现由于log file sync而导致的会话阻塞,系统会感到缓慢。
具体分析如下(LGWR日志):
*** 2017-05-08 23:09:12.984
Warning: log write elapsed time 513ms, size 1KB

*** 2017-05-08 23:12:14.490
Warning: log write elapsed time 517ms, size 1KB

*** 2017-05-08 23:15:15.936
Warning: log write elapsed time 516ms, size 1KB

*** 2017-05-08 23:24:20.392
Warning: log write elapsed time 525ms, size 1KB

*** 2017-05-08 23:33:24.895
Warning: log write elapsed time 545ms, size 1KB

AWR日志:
Avg
%Time Total Wait wait Wait % DB
Event Waits -outs Time (s) (ms) /txn time
--------------- ------------ ----- - --------- ------- -------- ------
log file sync 10 0 26 2550 0.0 1.3


SUPPORT ORACLE:

Intermittent Long 'log file sync' Waits on Veritas ODM, LGWR Posting Long Write Times, I/O Portion of Wait Minimal (文档 ID 1278149.1) 转到底部

________________________________________
In this Document
Symptoms

Changes

Cause

Solution

References

________________________________________

This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review.
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Information in this document applies to any platform.
SYMPTOMS
• The LGWR appears to be encountering regular problems where it hangs for 3 seconds at a time.
During this time, an iostat -x on the redo devices shows zero IO being pushed through.
• Veritas ODM (Oracle Disk Manager) is being used
• LGWR trace shows numerous entries such as:
...
*** 2010-11-01 07:33:14.745
Warning: log write elapsed time 3203ms, size 1KB
(set event 10468 level 4 to disable this warning)
*** 2010-11-01 07:52:57.813
Warning: log write elapsed time 3345ms, size 3KB
(set event 10468 level 4 to disable this warning)
...
• The average wait times for log file sync do not typically appear to be made up of IO waits (there is no evidence of 3 second IOs):
NAME TOTAL_WAITS TOTAL_WAIT_TIME AVG_TIME_WAITED
----------------------- ----------- ----------------- -----------------
log file sync 84 12684.472 151.006
log file parallel write 17655 28848.331 1.634
The affect on any session in the database at this point is to wait with a log file sync event, effectively stalling the system for 3 seconds.
• When dtrace was run to trace the issue, the problem went away. Problem came back as soon as dtrace was stopped.
• Truss on LGWR also avoids the issue.
• At the times when LGWR suffers the slow write, the process is given a priority of 0. So it looks like the operating system is scheduling the LGWR process off CPU and it takes 3 seconds to get back on and catch up.
Dtrace looks to resolve the issue by either keeping the lgwr process on CPU or ensuring if it does come off CPU, it is re-instated immediately.
• System is NOT under load when this happens. It even happens more during the quiet periods.
• Renice the LDWR lgwr process makes no difference.
• From 10298 trace, the modules immediately prior to the waits are not the same and are sometimes IO related and are sometimes not which indicates that the delay is not necessarily down to a wait for I/O completion.
CHANGES
none
CAUSE
It appears that it is possible that an issue in Veritas ODM can cause such an issue.
An upgrade in veritas library from 5.0 to 5.1 (which includes ODM), resolved the LGWR issue.

A bug was created related to the Oracle side of this issue:
Document 10318123.8 - lgwr regularly stalls for 3 seconds at a time
SOLUTION
Run with veritas library 5.1 (which includes ODM) or check that all the related IO subsystem is at the latest patch revisions.


How To Validate/Verify ODM (Oracle Disk Manager) Is Enabled. (文档 ID 1352242.1) 转到底部

________________________________________
In this Document
Goal

Solution

Community Discussions

References

________________________________________
APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.2.0.2 [Release 10.2 to 11.2]
Information in this document applies to any platform.
GOAL
The present document provides an example about how to validate and verify if ODM (Oracle Disk Manager) is enabled on your environment.

ODM architecture

When the Veritas ODM library is linked, Oracle is able to bypass all caching and locks at the file system layer and to communicate directly with raw volumes. The SF Oracle RAC implementation of ODM generates performance equivalent to performance with raw devices while the storage uses easy-to-manage file systems.

All ODM features can operate in a cluster environment. Nodes communicate with each other before performing any operation that could potentially affect another node. For example, before creating a new data file with a specific name, ODM checks with other nodes to see if the file name is already in use.

Veritas ODM performance enhancements

Veritas ODM enables performance benefits provided by Oracle Disk Manager:

â¢Locking for data integrity.
â¢Few system calls and context switches.
â¢Increased I/O parallelism.
â¢Efficient file creation and disk allocation.

Databases using file systems typically incur additional overhead:

â¢Extra CPU and memory usage to read data from underlying disks to the file system cache. This scenario requires copying data from the file system cache to the Oracle cache.
â¢File locking that allows for only a single writer at a time. Allowing Oracle to perform locking allows for finer granularity of locking at the row level.
â¢File systems generally go through a standard Sync I/O library when performing I/O. Oracle can make use of Kernel Async I/O libraries (KAIO) with raw devices to improve performance.

ODM communication - Port d
ODM uses port d to communicate with other ODM instances to support the file management features of Oracle Managed Files (OMF). OMF enables DBAs to set database parameters, such as the init.ora parameters for db datafile, controlfile, and logfile names, and for those structures to be named automatically. OMF allows for the automatic deletion of physical data files when DBAs remove tablespaces.

References:

https://sort.symantec.com/public/documents/sf/5.0/aix/html/sf_rac_install/sfrac_intro10.html
SOLUTION
A) Please verify that Veritas ODM is correctly enabled and configured as the following exampe (your ODM version & Oracle version could be different):


1. Check the VRTSdbed license:

For Database Edition 3.0:
# /usr/sbin/vxlicense -t DATABASE_EDITION| egrep \ 'Feature name:|Expiration date:'
vrts:vxlicense: INFO: Feature name: DATABASE_EDITION [100]
vrts:vxlicense: INFO: Expiration date: No expiration date



For Database Edition 3.5:
# /sbin/vxlictest -n "VERITAS Database Edition for Oracle" -f "ODM"
ODM feature is licensed

2. Check that the VRTSodm package is installed:

For Database Edition 3.0 or Database Edition 3.5:
# pkginfo VRTSodm
system VRTSodm VERITAS Oracle Disk Manager

3. Determine whether the Oracle version is 32-bit or 64-bit:
# cd $ORACLE_HOME/bin
# file oracle

For a 32-bit version, the output is as follows:
oracle: ELF 32-bit MSB executable SPARC Version 1, dynamically linked, not stripped
For a 64-bit version, the output is as follows:
oracle: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped
4 . Check that libodm.so is present and correctly linked:

For Database Edition 3.0 or Database Edition 3.5:

If running 32-bit Oracle, use the following command:
#ls -lL /usr/lib/libodm.so
-rw-r--r-- 1 root sys 13288 Apr 25 18:42 /usr/lib/libodm.so

# cmp $ORACLE_HOME/lib/libodm9.so /usr/lib/libodm.so
# echo $?
0

Alternatively, if you are running 64-bit Oracle, use the following command:
# ls -lL /usr/lib/sparcv9/libodm.so
-rw-r--r-- 1 root sys 16936 Apr 25 18:42 /usr/lib/sparcv9/libodm.so

# cmp $ORACLE_HOME/lib/libodm9.so /usr/lib/libodm.so
# echo $?
0

5. If the above link test fails, then check that the following links have been created correctly:

If running 32-bit Oracle, use the following command:
# ln -s /opt/VRTSodm/lib/libodm.so $ORACLE_HOME/lib/libodm9.so

Alternatively, if you are running 64-bit Oracle, use the following command:
# ln -s /opt/VRTSodm/lib/sparcv9/libodm.so $ORACLE_HOME/lib/libodm9.so

6. Check that the instance is using the Oracle Disk Manager function:
# cat /dev/odm/stats >/dev/null
# echo $?
0

B) For ODM configuration with Oracle Server 11.2 and DNFS please check the next manual:
http://www.oracle.com/pls/db112/homepage
=)> Oracle® Grid Infrastructure Installation Guide
11g Release 2 (11.2) for HP-UX
Part Number E17211-09
==)> 3 Configuring Storage for Grid Infrastructure for a Cluster and Oracle Real Application Clusters (Oracle RAC)
===)> 3.2.8 Enabling Direct NFS Client Oracle Disk Manager Control of NFS