Oracle Exadata X8M provides Intel Optane persistent memory (PMEM) in two ways, PMEMCache and PMEMLog, both of which are configured automatically when the Exadata software is installed. Keep in mind that to take full advantage of PMEMCache and PMEMLog you need to be running Oracle Database 19c or above on an Exadata X8M with the RoCE network. For databases below 19c, even with the RoCE network, Exadata accesses the persistent memory via the pre-existing Exadata I/O path to the storage cells, which can still provide some improvement; the maximum performance advantage comes with 19c and above databases on Exadata with RoCE, which use Remote Direct Memory Access (RDMA) to access the persistent memory on the storage cells. The good news is that there is no additional configuration, application change, or other action you need to take to get the advantages of the new PMEMCache and PMEMLog on Exadata.
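As a quick sanity check after installation, the storage cells expose the PMEM cache and PMEM log as their own CellCLI objects; a minimal sketch below, assuming an X8M storage cell with current Exadata software (the exact attributes returned can vary by software version):

CellCLI> LIST PMEMCACHE DETAIL
CellCLI> LIST PMEMLOG DETAIL

Each command should return attributes such as the object's name, size, and status for that cell, confirming that the PMEM cache and PMEM log were created automatically by the installation.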
For persistent memory, data access is slower than DRAM but faster than SSD. As a rough comparison (these figures are approximate and meant only to illustrate the scale of the gains), DRAM access time is about 80-100 nanoseconds, Flash SSD storage is around 200 microseconds, and a spinning hard disk is between 1 and 9 milliseconds, while persistent memory access time sits at about 300 nanoseconds. In other words, persistent memory is roughly 3x slower than DRAM, but about 600x faster than Flash SSD and over 3,000x faster than a spinning hard disk. This allows Exadata, in some cases, to take advantage of faster-than-flash speed, improving performance by adding a layer after DRAM and before the Flash Cache.
The persistent memory is automatically replicated across all storage servers. This provides multi-path access to all of the data in persistent memory and also adds a large layer of resiliency. The persistent memory is also only accessible via the Oracle database, so database access controls apply; using persistent memory via the OS or local access is not possible. By only allowing the database to access the PMEM, you can be assured that the data is secure, because the database controls access and maintains consistency and access control for the data. Exadata hardware monitoring and fault management is performed via ILOM and includes the persistent memory hardware modules. When the time comes to remove the Exadata or reinstall the storage servers, secure erase is automatically run on the persistent memory modules, ensuring that no data remains after a deinstall or reinstall.
The persistent memory (PMEM) PMEMCache adds a storage tier between DRAM (local server memory) and Flash (Flash Cache). The Exadata X8M adds 1.5 TB of persistent memory to the High Capacity and Extreme Flash Storage Servers. The persistent memory enables reads to happen at near local memory speed, ensures writes survive any power failure that may occur, and can be accessed by a 19c database from all database servers in the Exadata rack. The Exadata X8M Storage Servers transparently manage the persistent memory in front of flash memory. Exadata Database Servers running Oracle Database 19c or above access the Optane persistent memory (PMEM) directly in the Exadata Storage Servers, which is made possible by the 100 Gb RDMA over Converged Ethernet (RoCE) switches. This path bypasses the network and I/O software stacks, the storage controller, interrupts, and context switches, delivering latency of 19 microseconds or less and as much as 16 million 8K IOPS per rack. Many database functions and all storage functions are handled by the Exadata Storage Servers, freeing up resources on the Exadata Database Servers and improving performance.
Database Server (Database 19c or Above)
                |
                v
      PMEM   -> Really Hot
                |
      FLASH  -> Hot
                |
      DISK   -> Colder
In Oracle Database 19c the database can put the redo log directly on the persistent memory (PMEM) PMEMLog on multiple storage servers using RDMA over the 100 Gb converged Ethernet (RoCE) network. Keep in mind that this is not the database's entire redo log; it contains only the most recently written records. Because the database uses RDMA for writing the redo, redo log writes are up to 8x faster. And since the redo goes to PMEM, and PMEM is duplicated on multiple storage servers, it also provides resilience for the redo. So, for example, if your database sees high log file sync waits at times, this could help with that issue. When on Exadata X8M with Oracle Database 19c it is not recommended to run the storage cells in write-back mode, because with PMEMLog in use the redo would be written to both when in write-back mode.
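If you want to see whether redo write latency is actually a pain point, one simple place to look from the database side is the log file sync wait event; a minimal SQL sketch below (these are cumulative instance-level numbers since startup, so read them alongside AWR for a specific time window):

-- average 'log file sync' wait in microseconds, cumulative since instance startup
SELECT event,
       total_waits,
       ROUND(time_waited_micro / NULLIF(total_waits, 0)) AS avg_wait_us
  FROM v$system_event
 WHERE event IN ('log file sync', 'log file parallel write');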
Oracle Database 19c and above has AWR statistics from various Exadata storage server components, which includes the persistent memory (PMEM) for both PMEMCache and PMEMLog, in addition to Smart I/O, Smart Flash Cache, and Smart Flash Log. It includes I/O statistics from both the operating system and the cell server software, and it performs outlier analysis using those I/O statistics. Statistics for the PMEM cache are different because the database issues RDMA I/O directly to the PMEM cache and does not go through cellsrv; the storage server does not see the I/Os done via RDMA over RoCE, therefore there are no cell metrics for PMEM cache I/O. Instead, Oracle Database statistics account for the I/O that is performed using RDMA.
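Because the RDMA I/O is accounted for on the database side, a quick way to see what your release exposes is to look for PMEM-related statistics in V$SYSSTAT; a minimal sketch below (statistic names vary by version, so the wildcard simply lists whatever is present rather than assuming specific names):

-- list whatever PMEM-related cumulative statistics this database version exposes
SELECT name, value
  FROM v$sysstat
 WHERE LOWER(name) LIKE '%pmem%'
 ORDER BY name;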
** The AWR report examples are from the Oracle documentation found here:
The AWR report has a section that shows the PMEM configuration; the example shows it with write-through mode.
The AWR report shows PMEM Cache space usage both as a summary and in detail per storage cell.
The section on PMEM Cache Internal Writes covers the RDMA writes made to the PMEM that populate the PMEM Cache.
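To generate an AWR report containing these Exadata sections on your own system, the standard AWR report script can be run from SQL*Plus against a 19c or above database (this is the generic AWR script, nothing Exadata-specific):

SQL> @?/rdbms/admin/awrrpt.sql

The script prompts for the report type (html or text), the number of days of snapshots to list, the begin and end snapshot IDs, and the report file name.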
We can also see PMEM information from each storage cell using the CellCLI command-line utility, for example:
CellCLI> LIST METRICDEFINITION ATTRIBUTES NAME,DESCRIPTION WHERE OBJECTTYPE = "PMEMCACHE";
         PC_BY_ALLOCATED        "Number of megabytes allocated in PMEM cache"
CellCLI> list metriccurrent where name = 'PC_BY_ALLOCATED';
         PC_BY_ALLOCATED        PMEMCACHE        1,541,436 MB
CellCLI> list metriccurrent where name = 'DB_PC_BY_ALLOCATED';
         DB_PC_BY_ALLOCATED    ASM                    0.000 MB
         DB_PC_BY_ALLOCATED    DT4DB1             802,271 MB
         DB_PC_BY_ALLOCATED    DT4DB2              14,096 MB
         DB_PC_BY_ALLOCATED    DT4DB3             141,912 MB
         DB_PC_BY_ALLOCATED    DT4DB4             154,608 MB
         DB_PC_BY_ALLOCATED    DT4DB5             426,958 MB
         DB_PC_BY_ALLOCATED    DT4DB6               1,506 MB
         DB_PC_BY_ALLOCATED    _OTHER_DATABASE_      76.125 MB
CellCLI> list metriccurrent where name like '.*PC.*';
         DB_PC_BY_ALLOCATED    ASM                    0.000 MB
         DB_PC_BY_ALLOCATED    DT4ARIES           802,465 MB
         DB_PC_BY_ALLOCATED    DT4ETLSTG           14,114 MB
         DB_PC_BY_ALLOCATED    DT4MPI             141,867 MB
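To trend PMEM cache usage over time rather than just look at the current values, the same metrics are also kept in the cell metric history (retention is controlled by the cell's metricHistoryDays attribute); a minimal sketch below, where the timestamp is only an example value:

CellCLI> LIST METRICHISTORY ATTRIBUTES name, metricValue, collectionTime WHERE name = 'PC_BY_ALLOCATED' AND collectionTime > '2021-01-01T00:00:00-05:00';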