| Oracle user alert | User alert |
| Long running SQL | A SQL has been running for too long |
| Data tablespace free space | Percentage of free space in a tablespace |
| Temp tablespace free space | Percentage of free space in a temporary tablespace |
| Undo tablespace free space | Percentage of free space in an undo tablespace |
| No temp space allocated | No temp space is allocated for the database. Some operations requiring temp
space will fail |
| Unarchived logs | Percentage of unarchived online redo logs |
| PGA - multipass executions | Under the one-pass threshold, when the size of a work area is far too small compared to the input data
size multiple passes over the input data are needed This could dramatically increase the response time of the operator.
This is known as the multi-pass size of the work area For example, a serial sort operation that needs
to sort 10GB of data needs a little more than 10GB to run optimal and at least 40MB to run one-pass If this sort gets less
that 40MB, then it must perform several passes over the input data |
| PGA - one-pass executions | When the size of the work area is smaller than optimal, the response time increases, because an extra
pass is performed over part of the input data. This is known as the one-pass size of the work area.
The goal is to have most work areas running with an optimal size (for example, more than 90% or even 100% for pure OLTP
systems), while a smaller fraction of them are running with a one-pass size (for example, less than 10%) |
| Buffer cache (2K) hit ratio | Buffer cache hit ratio shows how frequently data blocks are accessed from the memory (buffer cache)
rather than from disk. A hit ratio of 95% or greater is considered to be a good hit ratio for
OLTP systems. The hit ratio for DSS (Decision Support System) may vary depending on the database load.
A correctly tuned buffer cache can significantly improve overall database performance. By itself however, the
buffer cache hit ratio is not very meaningful, and the data buffer cache hit ratio is largely meaningless for
decision support and data warehouse applications because of their propensity to have full-table scans and parallel
full-table scans (which may bypass the data buffers entirely, using PGA memory).
|
| Buffer cache (4K) hit ratio | Buffer cache hit ratio shows how frequently data blocks are accessed from the memory (buffer cache)
rather than from disk. A hit ratio of 95% or greater is considered to be a good hit ratio for
OLTP systems. The hit ratio for DSS (Decision Support System) may vary depending on the database load.
A correctly tuned buffer cache can significantly improve overall database performance. By itself however, the
buffer cache hit ratio is not very meaningful, and the data buffer cache hit ratio is largely meaningless for
decision support and data warehouse applications because of their propensity to have full-table scans and parallel
full-table scans (which may bypass the data buffers entirely, using PGA memory).
|
| Buffer cache (8K) hit ratio | Buffer cache hit ratio shows how frequently data blocks are accessed from the memory (buffer cache)
rather than from disk. A hit ratio of 95% or greater is considered to be a good hit ratio for
OLTP systems. The hit ratio for DSS (Decision Support System) may vary depending on the database load.
A correctly tuned buffer cache can significantly improve overall database performance. By itself however, the
buffer cache hit ratio is not very meaningful, and the data buffer cache hit ratio is largely meaningless for
decision support and data warehouse applications because of their propensity to have full-table scans and parallel
full-table scans (which may bypass the data buffers entirely, using PGA memory).
|
| Buffer cache (16K) hit ratio | Buffer cache hit ratio shows how frequently data blocks are accessed from the memory (buffer cache)
rather than from disk. A hit ratio of 95% or greater is considered to be a good hit ratio for
OLTP systems. The hit ratio for DSS (Decision Support System) may vary depending on the database load.
A correctly tuned buffer cache can significantly improve overall database performance. By itself however, the
buffer cache hit ratio is not very meaningful, and the data buffer cache hit ratio is largely meaningless for
decision support and data warehouse applications because of their propensity to have full-table scans and parallel
full-table scans (which may bypass the data buffers entirely, using PGA memory).
|
| Buffer cache (32K) hit ratio | Buffer cache hit ratio shows how frequently data blocks are accessed from the memory (buffer cache)
rather than from disk. A hit ratio of 95% or greater is considered to be a good hit ratio for
OLTP systems. The hit ratio for DSS (Decision Support System) may vary depending on the database load.
A correctly tuned buffer cache can significantly improve overall database performance. By itself however, the
buffer cache hit ratio is not very meaningful, and the data buffer cache hit ratio is largely meaningless for
decision support and data warehouse applications because of their propensity to have full-table scans and parallel
full-table scans (which may bypass the data buffers entirely, using PGA memory).
|
| Keep pool hit ratio | The KEEP pool is where you place the objects you want to keep cached in memory. After a short
warmup period the KEEP pool hit ratio should be as close to 100% as possible
|
| Recycle pool hit ratio | The RECYCLE pool is where you place the objects you never want to keep cached in memory.
The cache hit ratio for the RECYCLE pool should be as close to 0% as possible (i.e. all access to the
RECYCLE pool should result in a physical read)
|
| Dictionary cache hit ratio | Misses on the data dictionary cache are to be expected in some cases. On instance startup,
the data dictionary cache contains no data. Therefore, any SQL statement issued is likely to result
in cache misses. As more data is read into the cache, the likelihood of cache misses decreases.
Eventually, the database reaches a steady state, in which the most frequently used dictionary data
is in the cache. At this point, very few cache misses occur. Typically, if the shared pool is adequately
sized for the library cache, it will also be adequate for the dictionary cache data |
| Redo buffer allocation retries | The value of redo buffer allocation retries should be near zero over an interval.
If this value increments consistently, then processes have had to wait for space in the redo log buffer. The wait can be
caused by the log buffer being too small or by checkpointing. Increase the size of the redo log buffer, if necessary, by
changing the value of the initialization parameter LOG_BUFFER. The value of this parameter is expressed in bytes.
Alternatively, improve the checkpointing or archiving process. |
| Sorts to disk ratio | Sorts on disk are sorts that could not be performed in memory, therefore they are more expensive
because memory access is faster than disk access. When Oracle writes sort operations to disk, it writes out partially
sorted data in sorted runs. After all the data has been received by the sort, Oracle merges the runs to produce the final
sorted output. If the sort area is not large enough to merge all the runs at once, then subsets of the runs are merged in
several merge passes. If the sort area is larger, then there are fewer, longer runs produced. A larger sort area also means
that the sort can merge more runs in one merge pass |
| Soft parse ratio | Soft parse ratio: This shows whether there are many hard parses on the system. The ratio should be
compared to the raw statistics to ensure accuracy. For example, a soft parse ratio of 80% typically indicates a high hard
parse rate. However, if the total number of parses is low, then the ratio should be disregarded |
| Parse to execute ratio | In an operational environment, optimally a SQL statement should be parsed once and executed
many times. This alert could indicate poor cursor reuse. Find out what specific SQL statements have a parse count
that is equal to the execute count. These statements are contributing to ineffective cursor sharing |
| Archive destination error | Oracle reports an error archiving to one of the archive destinations |
| Archive destination error | Archive destination configuration error |
| Response time | The time it takes for the Oracle server to respond to a simple query is too high.
The server may be overloaded |
| Timed Statistics | Timed statistics is not enabled on the database. Statistics information is not available for this
database. To enable timed statistics set STATISTICS_LEVEL initialization parameter to TYPICAL or ALL |
| Buffer cache advice | Buffer cache advice is not enabled on the database. Statistics information is not available
for this database. To enable buffer cache advisor set DB_CACHE_ADVICE initialization parameter to ON |
| PGA Advice | PGA Advice is not enabled on the database. PGA Advice is not available if PGA_AGGREGATE_TARGET
is not set. In addition, PGA Advice is not updated if the STATISTICS_LEVEL parameter is set to BASIC. To enable
PGA Advice set STATISTICS_LEVEL initialization parameter to TYPICAL or ALL |
| Shared pool advice | Shared Pool Advice is not enabled on the database. Statistics information is not available for
this database. To enable timed statistics set STATISTICS_LEVEL initialization parameter to TYPICAL or ALL |
| Physical standby out of sync | A standby database is out of sync. It may not be receiving redo logs from the primary database
or is too busy to process in a timely manner |
| Physical standby not applying logs | A standby database is receiving redo logs from the primary database but is not applying them in a timely manner.
It may be too busy or the log apply process is down |
| Archived log gap detected | An archive gap can occur on the standby system when it is has not received one or more archived redo log files
generated by the primary database. The missing archived redo log files are the gap. If there is a gap, it is automatically
detected and resolved by Data Guard by copying the missing sequence of log files to the standby destination. For example,
an archive gap can occur when the network becomes unavailable and automatic archiving from the primary database to the
standby database temporarily stops. When the network is available again, automatic transmission of the redo data from the
primary database to the failed standby database resumes. Data Guard requires no manual intervention by the DBA to detect
and resolve such gaps. In some situations, automatic gap recovery may not take place and you will need to perform gap
recovery manually. For example, you will need to perform gap recovery manually if you are using logical standby databases
and the primary database is not available |
| Number of waiting sessions | The number of sessions waiting to proceed is high. This could be due to resource contention |
| Log file sync | When a user session commits (or rolls back), the session's redo information must be flushed to
the redo logfile by LGWR. The server process performing the COMMIT or ROLLBACK waits under this event for the write
to the redo log to complete. If this event's waits constitute a significant wait on the system or a significant amount
of time waited by a user experiencing response time issues or on a system, then examine the average time waited. If
the average time waited is low, but the number of waits are high, then the application might be committing after every
INSERT, rather than batching COMMITs. Applications can reduce the wait by committing after 50 rows, rather than every row.
If the average time waited is high, then examine the session waits for the log writer and see what it is spending most
of its time doing and waiting for. If the waits are because of slow I/O, then try the following: -
Reduce other
I/O activity on the disks containing the redo logs, or use dedicated disks- Alternate
redo logs on different
disks to minimize the effect of the archiver on the log writer- Move the redo logs to faster disks or a faster
I/O subsystem- Consider using raw devices to speed up the writes-
Depending on the type of application,
it might be possible to batch COMMITs by committing every N rows, rather than every row, so that fewer log file syncs
are needed |
| Log buffer space | This event occurs when server processes are waiting for free space in the log buffer, because all
the redo is generated faster than LGWR can write it out.Modify the redo log buffer size. If the size of the log buffer
is already reasonable, then ensure that the disks on which the online redo logs reside do not suffer from I/O contention.
The log buffer space wait event could be indicative of either disk I/O contention on the disks where the redo logs reside,
or of a too-small log buffer. Check the I/O profile of the disks containing the redo logs to investigate whether the I/O
system is the bottleneck. If the I/O system is not a problem, then the redo log buffer could be too small. Increase the
size of the redo log buffer until this event is no longer significant |
| Log file parallel write | The log file parallel write alert is triggered when the sessions
are waiting for log writer to write redo from log buffer to all the members of the redo log group.
LGWR writes to the active log file members in parallel only if the asynchronous I/O is in use.
Otherwise it writes to each active log file member sequentially. If the sessions are waiting on
this event you may want to put redo log files on faster devices or investigate if there is a contention
on the redo log drives |
| Log file switch (archiving needed) | The log file switch (archiving needed) alert is triggered when
sessions are waiting for a log switch because the log that the LGWR will be switching into has not been
archived yet. Check the alert file to make sure that archiving has not stopped due to a failed archive
write. To speed archiving, consider adding more archive processes or putting the archive files on
striped disks |
| Log file switch (checkpoint incomplete) | The log file switch (checkpoint incomplete) alert is triggered when
sessions are waiting for a log switch because the sessions cannot wrap into the next log. Wrapping
cannot be performed because the checkpoint for that log has not completed. You may see this alert when
the redo log files are sized too small |
| Write complete | The write complete waits occur because Oracle cannot allow
blocks that are about to be written to disk by DBWR to be modified, lest an inconsistent block image
be written to disk. If write complete waits constitute a significant part of the overall waits then
an inefficiency in the I/O subsystem could be the cause. Possible solutions are:
- Spread datafiles across multiple disks
- Use multiple database writers
- Turn on asynchronous I/O |
| Buffer busy waits | The buffer busy wait alert is fired when a session wants to
access a data block in the buffer cache that is currently in use by some other session. The other
session is either reading the same data block into the buffer cache or it is modifying the one in
the buffer cache. It might be that many processes are inserting into the same block and must wait
for each other before they can insert. The solution could be to use automatic segment space management
or partitioning for the object in question. Also look at the specific buffer wait statistics available
in the view V$WAITSTAT to see which block type has the highest wait count and the highest wait time.
The possible actions might be:
If the predominant buffer waits are for:
Data blocks - Eliminate HOT blocks from the application. Check for repeatedly scanned / unselective indexes.
Change PCTFREE and/or PCTUSED. Check for 'right- hand-indexes' (indexes that get inserted into at the same
point by many processes). Increase INITRANS. Reduce the number of rows per block.
Segment header - Increase of number of FREELISTs. Use FREELIST GROUPs (even in single instance this can make
a difference).
Freelist blocks - Add more FREELISTS. In case of Parallel Server make sure that each
instance has its own FREELIST GROUP(s).
Undo header - If you are not using automatic undo management, then add more rollback segments or increase their size
You can select also from V$SEGSTAT (or its user-friendly but costly equivalent V$SEGMENT_STATISTICS)
the top segments that cause the buffer busy waits |
| Free buffer waits | This alert indicates that a server process was unable to find a free buffer and has posted the
database writer to make free buffers by writing out dirty buffers. A dirty buffer is a buffer whose contents have been
modified. Dirty buffers are freed for reuse when DBWR has written the blocks to disk. DBWR may not be keeping up with
writing dirty buffers in the following situations:
- The I/O system is slow.There are resources it is waiting for, such as latches.
- The buffer cache is so small that DBWR spends most of its time cleaning out buffers for server processes
- The buffer cache is so big that one DBWR process is not enough to free enough buffers in the cache to
satisfy requests
If this event occurs frequently, then examine the session waits for DBWR to see whether there is anything delaying DBWR.
Writes
If it is waiting for writes, then determine what is delaying the writes and fix it. Check the following:
Examine V$FILESTAT to see where most of the writes are happening
Examine the host operating system statistics for the I/O system. Are the write times acceptable?
If I/O is slow:
Consider using faster I/O alternatives to speed up write times
Spread the I/O activity across large number of spindles (disks) and controllers
Cache is Too Small
It is possible DBWR is very active because the cache is too small. Investigate whether this is a probable cause by
looking to see if the buffer cache hit ratio is low. Also use the V$DB_CACHE_ADVICE view to determine whether a larger
cache size would be advantageous
Cache Is Too Big for One DBWR
If the cache size is adequate and the I/O is already evenly spread, then you can potentially modify the behavior
of DBWR by using asynchronous I/O or by using multiple database writers
Consider Multiple Database Writer (DBWR) Processes or I/O Slaves
Configuring multiple database writer processes, or using I/O slaves, is useful when the transaction rates are
high or when the buffer cache size is so large that a single DBWn process cannot keep up with the load |
| Lock waits | The lock wait alert is triggered when the sessions
spend signigicant time waiting on locks; these can be either explicit user locks or implicit locks
Oracle puts on objects during their modifications like e.g. locks on rows or index entries during
inserts, updates or deletes |
| Cache buffers LRU chain latch waits | The cache buffers LRU chain alert is usually triggered by
excessive buffer cache throughput. For example, inefficient SQL that accesses incorrect indexes
iteratively (large index range scans) or many full table scans. The cache buffer LRU chain latch must
be obtained in order to introduce a new block into the buffer cache, and when writing a buffer back
to disk. It is possible to reduce contention for the cache buffer lru chain latch by increasing the
size of the buffer cache and thereby reducing the rate at which new blocks are introduced into the
buffer cache. |
| Cache buffers chains latch waits | The cache buffers chains alert is usually triggered by
contention in the buffer cache. Reducing contention for the cache buffer chains latch will usually
require reducing logical I/O rates by tuning and minimizing the I/O requirements of the SQL involved.
High I/O rates could be a sign of a hot block (meaning a block highly accessed) |
| Redo allocation latch waits | The redo allocation latch controls the allocation of space
for redo entries in the redo log buffer. There is one redo allocation latch per instance. You need to
consider to increase the size of the LOG_BUFFER or reduce the load of the log buffer using NOLOGGING
features when possible |
| Redo copy latch waits | The redo copy latch is used to write redo records into the
redolog buffer. This latch is waited for on both single and multi-cpu systems |
| Library cache latch waits | The library cache latches protect the cached SQL statements and
objects definitions held in the library cache within the shared pool. The library cache latch must be
acquired in order to add a new statement to the library cache. During a parse, Oracle searches the library
cache for a matching statement. If one is not found, then Oracle will parse the SQL statement, obtain the
library cache latch and insert the new SQL
The first resource to reduce contention on this latch is to
ensure that the application is reusing as much as possible SQL statement representation. Use bind
variables whenever possible in the application. Misses on this latch may also be a sign that the
application is parsing SQL at a high rate and may be suffering from too much parse CPU overhead.If
the application is already tuned the SHARED_POOL_SIZE can be increased. Be aware that if the application
is not using the library cache appropriately, the contention might be worse with a larger structure to
be handled |
| Library cache pin latch waits | The library cache pin latch must be acquired when a
statement in the library cache is reexecuted. Misses on this latch occur when there is very
high rates SQL execution. There is little that can be done to reduce the load on the library
cache pin latch, although using private rather than public synonyms or direct object references
such as OWNER.TABLE may help. |
| Shared pool latch waits | The shared pool latch is used to protect critical
operations when allocating and freeing memory in the shared pool. If an application makes
use of literal (unshared) SQL then this can severely limit scalability and throughput.
The cost of parsing a new SQL statement is expensive both in terms of CPU requirements and
the number of times the library cache and shared pool latches may need to be acquired and
released. Before Oracle9, there use to be just one such latch to the entire database to
protects the allocation of memory in the library cache. In Oracle9 multiple childs were
introduced to relieve contention on this resource
Ways to reduce the shared pool latch are,
avoid hard parses when possible, parse once, execute many. Eliminating literal SQL is also
useful to avoid the shared pool latch. The size of the shared_pool and use of MTS
(shared server option) also greatly influences the shared pool latch |
| Row cache objects latch waits | The row cache objects latch comes into play when user
processes are attempting to access the cached data dictionary values. It is not common to have
contention in this latch and the only way to reduce contention for this latch is by increasing
the size of the shared pool |
| Chained row fetches | The chained rows alert is triggered when the percentage of chained or migrated
rows fetched exceeds a threshold. Retrieving rows that span more than one block increases the logical I/O by a factor
that corresponds to the number of blocks than need to be accessed. Exporting and re-importing or reorganizing the table
with the dbms_redefinition utility may eliminate this problem.
Evaluate the settings for the storage parameters PCTFREE and PCTUSED. This problem cannot be fixed if rows are larger
than database blocks (for example, if the LONG datatype is used and the rows are extremely large) |
| Session high physical reads | The session high physical reads alert is triggered when the database
is performing a lot of physical reads and a small percentage of active sessions is responsible for a significant
percentage of the total physical reads |
| Session high physical writes | The session high physical writes alert is triggered when the database
is performing a lot of physical writes and a small percentage of active sessions is responsible for a significant
percentage of the total physical writes |
| Session high logical reads | The session high logical reads alert is triggered when the database
is performing a lot of logical reads and a small percentage of active sessions is responsible for a significant
percentage of the total logical reads |
| Session high network load | The session high network load alert is triggered when a lot of data is
being sent to/from the database and a small percentage of active sessions is responsible for a significant
percentage of the total network traffic |
| Session high redo generation | The session high redo generation alert is triggered when a lot of redo
is being generated and a small percentage of active sessions is responsible for a significant
percentage of the total redo |
| Session high memory consumption | The session high memory consumption alert is triggered when a lot of memory
is being used and a small percentage of active sessions is responsible for a significant
percentage of the total amount of used memory |
| Session high CPU consumption | The session high CPU consumption alert is triggered when a lot of CPU resources
is being used and a small percentage of active sessions is responsible for a significant
percentage of the total used CPU time |
| Session high sorts | The session high sort alert is triggered when a lot of sorts
is being performed and a small percentage of active sessions is responsible for a significant
percentage of the total number of sorts |
| Library cache reloads | Previously cached SQL statements were aged out of the library cache. This could indicate that
shared pool is too small. In an application that reuses SQL effectively, on a system with an optimal shared
pool size, the number of reloads will have a value near zero |
| Library cache invalidations | Library cache data was invalidated and had to be reparsed. Invalidations should be near zero.
This means SQL statements that could have been shared were invalidated by some operation (for example, a DDL).
This statistic should be near zero on OLTP systems during peak loads. Consult V$LIBRARYCACHE
view for details |
| Parallel servers available | Parallel execution performs SQL operations in parallel using multiple parallel processes.
One process, known as the parallel execution coordinator, dispatches the execution of a statement to
several parallel execution servers and coordinates the results from all of the server processes to
send the results back to the user.
PARALLEL_MAX_SERVERS initialization parameter specifies the maximum number of parallel execution
processes and parallel
recovery processes for an instance. As demand increases, Oracle increases the number of processes
from the number created at instance startup up to this value.
The Parallel Server alert is triggered when all or most parallel
servers are busy. Try increasing PARALLEL_MAX_SERVERS initialization parameter to make more parallel
servers available.
If you set this parameter too low, some queries may not have a parallel execution process available to
them during query processing. If you set it too high, memory resource shortages may occur during peak
periods, which can degrade performance.
|
| Open sessions limit |
The number of open sessions is close to a limit. The limit is set by the initialization parameter SESSIONS.
SESSIONS specifies the maximum number of sessions that can be created in the system. Because
every login requires a session, this parameter effectively determines the maximum number of
concurrent users in the system. You should always set this parameter explicitly to a value
equivalent to your estimate of the maximum number of concurrent users, plus the number of background
processes, plus approximately 10% for recursive sessions.
|
| Processes limit |
The number of Oracle operating system user processes is close to a limit. The limit is set by the initialization
parameter PROCESSES.
PROCESSES specifies the maximum number of operating system user processes that can simultaneously connect to
Oracle. Its value should allow for all background processes such as locks, job queue processes, and parallel
execution processes. Try increasing PROCESSES initialization parameter is you are approaching this limit.
|
| Open cursors limit |
The number of opened cursors is close to a limit. The limit is set by the initialization
parameter OPEN_CURSORS.
OPEN_CURSORS specifies the maximum number of open cursors (handles to private SQL areas) a session can have at once.
You can use this parameter to prevent a session from opening an excessive number of cursors. This parameter also
constrains the size of the PL/SQL cursor cache which PL/SQL uses to avoid having to reparse as statements are reexecuted by a user.
It is important to set the value of OPEN_CURSORS high enough to prevent your application from running out of open cursors. The
number will vary from one application to another. Assuming that a session does not open the number of cursors specified by
OPEN_CURSORS, there is no added overhead to setting this value higher than actually needed
|
| Sessions are waiting | A session or sessions are blocked and waiting to proceed longer than a predefined threshold.
You may need to check what specific event the sessions are waiting for and resolve the situation
according to the type of event
|
| Cannot establish new connection | Attempt to establish a new connection failed. Listener may be down or maximum number of open
session reached |
| Oracle Alert Log |
${errorCode}
|
| Average time to get a consistent read block over the interconnect | Average time to get a consistent read block over the interconnect is high. Check your interconnect performance |
| Average time to get a current block over the interconnect | Average time to get a current block over the interconnect is high. Check your interconnect performance |
| Average time to get a global lock | Average time to get a global lock is high |
| Blocks lost during transfer over interconnect |
This statistic shows block losses during transfers. High
values indicate network problems. The use of an unreliable IPC protocol such
as UDP may result in the value for global cache blocks lost being non-zero. When
this occurs, take the ratio of global cache blocks lost divided by global cache current
blocks served plus global cache cr blocks served. This ratio should be as small as
possible. Many times, a non-zero value for global cache blocks lost does not
indicate a problem because Oracle will retry the block transfer operation until
it is successful.
|
| Blocks corrupt during transfer over interconnect |
This statistic shows if any blocks were corrupted during
transfers. If high values are returned for this statistic, there is probably an IPC,
network, or hardware problem
|