12c New Features: A Summary of the In-Memory Column Store

2024-04-02 19:04:59, 452 views, by 薄情痞子


Official documentation

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/inmem/concepts-for-the-im-column-store.html#GUID-5A72B48A-8427-41AE-9220-E46042BC90C4

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/inmem/configuring-the-im-column-store.html#GUID-8844C889-E381-4B77-8A51-7AA6462B14D7

The IM column store encodes data in a columnar format: each column is a separate structure. The columns are stored contiguously, which optimizes them for analytic queries. The database buffer cache can modify objects that are also populated in the IM column store. However, the buffer cache stores data in the traditional row format. Data blocks store the rows contiguously, optimizing them for transactions.

When you enable an IM column store, the SGA manages data in separate locations: the In-Memory Area and the database buffer cache.

The IM column store maintains copies of tables, partitions, and individual columns in a special compressed columnar format that is optimized for rapid scans.

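In-Memory Column Store means exactly that: data is held in memory column by column, with each column kept as a separate structure. Its advantage is that a query only has to read the columns it needs, whereas the database buffer cache stores data in the traditional row format and has to read every column of each row. Objects populated in the IM column store can still be modified through the buffer cache.

Once the In-Memory column store is enabled, the database allocates a static memory pool in the SGA at startup, the In-Memory Area, which holds the user tables stored in In-Memory columnar format. Copies of tables, partitions, and individual columns are kept there in the special compressed columnar format that is optimized for fast scans.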

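How data in the IM column store is kept in sync

Once a table populated in the IM column store is changed by DML, some mechanism has to keep the in-memory copy consistent, because in memory a DML statement only modifies the database buffer cache and the log buffer. Oracle uses the transaction journal for this. If the table modified by the DML is already populated in the IM column store, the database records metadata about the change, such as the rowid of each modified row, in the transaction journal and marks those rows stale as of the SCN of the DML. A later query that reads the table from the IM column store gets a consistent result by combining the data already in the IM column store, the transaction journal, and the database buffer cache.

If DML keeps arriving, the transaction journal keeps growing, and most of the data in the IM column store may eventually be stale, which badly hurts In-Memory query performance. Oracle therefore defines a staleness threshold: when the proportion of stale data in the IM column store reaches it, repopulation is triggered. By default Oracle checks every 2 minutes whether the threshold has been reached.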

Initialization parameters for the In-Memory column store

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/inmem/init-parameters-for-im-column-store.html#GUID-A67ABCAC-C6B9-499E-8AE0-BD7922B239BE

Views related to the In-Memory column store

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/inmem/views-related-to-im-column-store.html#GUID-2EBF8D9B-FA9E-4D67-8934-5908E6018D4E

Summary notes on In-Memory

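1. Two prerequisites for enabling the In-Memory column store at the database level: INMEMORY_SIZE must be set to at least 100M (its default of 0 leaves the IM column store disabled), and COMPATIBLE must be set to 12.1.0 or higher.

2. Tablespaces, tables, partitions, and materialized views can all be enabled for the In-Memory column store. Once a tablespace is enabled, every table and materialized view created in it afterwards is enabled by default; tables that already existed in the tablespace are not affected. When enabling a tablespace, the INMEMORY keyword must be preceded by DEFAULT.

3. A table is enabled for the In-Memory column store when INMEMORY is specified in CREATE TABLE or ALTER TABLE.

4. To check whether a table is enabled, look at USER_TABLES.INMEMORY: a value of 'ENABLED' means it is.

5. Enabling a table does not mean its data is loaded into the IM column store right away; that happens only at instance startup or when the object is accessed. To populate a table immediately, force a full table scan on it or call DBMS_INMEMORY.POPULATE. As long as the object's INMEMORY PRIORITY is not NONE, it is populated automatically when the instance, or the PDB that contains the object, starts. To check whether a table's data is already in the In-Memory Area, look for it in V$IM_SEGMENTS.SEGMENT_NAME. If a table is listed in V$IM_SEGMENTS, its row disappears after TRUNCATE TABLE but remains after DELETE.

6. Starting with 12.2.0, an ILM ADO policy can control In-Memory column store settings. ILM ADO policies take effect at the database level, not the instance level. An Information Lifecycle Management (ILM) Automatic Data Optimization (ADO) policy decides on which table, when, and under what conditions the In-Memory column store setting is applied or removed. The syntax is ALTER TABLE TABLE_NAME ILM ADD POLICY SET|MODIFY|NO INMEMORY.

7. INMEMORY can be enabled for specific columns only: name those columns in an INMEMORY clause, and list the remaining columns explicitly in a NO INMEMORY clause, because columns left unlisted are still enabled (see experiment 5 below). A column enabled this way cannot be a LONG or LONG RAW column, an out-of-line column (LOB, varray, nested table column), or an extended data type column. If only some columns of a table are enabled, the table is not returned by USER_TABLES.INMEMORY='ENABLED'; query V$IM_COLUMN_LEVEL.INMEMORY_COMPRESSION<>'NO INMEMORY' instead.

8. Objects that cannot use the In-Memory column store: indexes, index-organized tables, hash clusters, and objects owned by the SYS user and stored in the SYSTEM or SYSAUX tablespace. In addition, if a table enabled for the IM column store contains any of the following column types, those columns are not populated: out-of-line columns (varrays, nested table columns, and out-of-line LOBs), columns that use the LONG or LONG RAW data types, and extended data type columns.

9. If no INMEMORY PRIORITY is specified, the default is NONE, and the object is populated only when it is accessed by a full table scan. Access through an index scan or by rowid does not populate it. If the priority is anything other than NONE, the object is populated automatically during database startup, in order of priority level.

10. If no MEMCOMPRESS level is specified, the default is MEMCOMPRESS FOR QUERY LOW.

11. If DUPLICATE is not specified, the default is NO DUPLICATE. DUPLICATE and DUPLICATE ALL work only in an Oracle RAC environment on an Oracle Engineered System; anywhere else they have no effect and are treated as NO DUPLICATE.

12. If DISTRIBUTE is not specified, the default is AUTO, and tables in the IM column store are distributed across the nodes. DISTRIBUTE applies only in an Oracle RAC environment.

13. On populate versus repopulate: populate converts existing data on disk into columnar format and stores it in the IM column store, while repopulate loads new data into the IM column store. Put simply, populate is the initial full load and repopulate is the incremental refresh.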
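
The enable, check, and populate steps from the list above can be walked through in one short SQL*Plus session. This is only a sketch under stated assumptions: a 12.2 instance started from an spfile, SYSDBA privileges for the first part, and a table named TABLE1 in the current schema. The 2G size is an arbitrary illustration, not a recommendation.

```sql
-- Enable the IM column store at the database level (2G is illustrative only).
ALTER SYSTEM SET INMEMORY_SIZE = 2G SCOPE=SPFILE;

-- Going from 0 to a non-zero size takes effect after a restart.
SHUTDOWN IMMEDIATE
STARTUP

-- Confirm that the In-Memory Area was allocated.
SELECT pool, alloc_bytes, used_bytes, populate_status
FROM   v$inmemory_area;

-- Enable a table; with no attributes given, the defaults apply
-- (PRIORITY NONE, MEMCOMPRESS FOR QUERY LOW).
ALTER TABLE table1 INMEMORY;

-- Check that the table is enabled.
SELECT table_name, inmemory, inmemory_priority, inmemory_compression
FROM   user_tables
WHERE  table_name = 'TABLE1';

-- Request population now instead of waiting for a full table scan.
EXEC DBMS_INMEMORY.POPULATE(USER, 'TABLE1')

-- Check whether the segment has reached the IM column store.
SELECT segment_name, populate_status, bytes_not_populated
FROM   v$im_segments
WHERE  segment_name = 'TABLE1';
```

Population runs in the background, so the last query may need to be repeated until POPULATE_STATUS reports completion.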

Some experiment results

1. Enabling INMEMORY for a tablespace

When creating or altering a tablespace for INMEMORY, the INMEMORY keyword must be preceded by DEFAULT.

SQL> create tablespace tablespace1 datafile '/u02/data/tablespace2.dbf' size 100M inmemory;

ERROR at line 1:

ORA-02180: invalid option for CREATE TABLESPACE

SQL> create tablespace tablespace1 datafile '/u02/data/tablespace2.dbf' size 100M default inmemory;

Tablespace created.

SQL> alter tablespace USERS inmemory;

ERROR at line 1:

ORA-02142: missing or invalid ALTER TABLESPACE option

SQL> alter tablespace USERS default inmemory;

Tablespace altered.
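
To confirm that the DEFAULT INMEMORY setting took effect, the tablespace defaults can be read back from the data dictionary. A minimal sketch, assuming access to DBA_TABLESPACES and the two tablespace names used in the experiment above:

```sql
-- Show the In-Memory defaults that new tables in these tablespaces inherit.
SELECT tablespace_name,
       def_inmemory,
       def_inmemory_priority,
       def_inmemory_compression
FROM   dba_tablespaces
WHERE  tablespace_name IN ('TABLESPACE1', 'USERS');

-- Remove the default again if it was set only for testing.
ALTER TABLESPACE users DEFAULT NO INMEMORY;
```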

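2. Enabling INMEMORY for a table

With CREATE TABLE ... AS SELECT, INMEMORY must come before AS.

create table table1 (hid number(10)) inmemory;

alter table table2 inmemory;

create table t4 inmemory as select * from t1; -- t4 is enabled for the In-Memory column store

create table t5 as select * from t1 inmemory; -- t5 is not enabled (here INMEMORY is most likely parsed as a table alias for t1)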
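
The difference between the two CREATE TABLE ... AS SELECT forms can be checked in the dictionary. A minimal sketch, assuming T4 and T5 were created in the current schema as shown above:

```sql
-- According to the experiment above, T4 should report ENABLED and T5 DISABLED.
SELECT table_name, inmemory
FROM   user_tables
WHERE  table_name IN ('T4', 'T5')
ORDER  BY table_name;
```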

3. Enabling INMEMORY for a materialized view

create materialized view mview1 inmemory as select * from table1;

alter materialized view mview2 inmemory;

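4. Enabling INMEMORY for selected partitions of a partitioned table

When the table is created, the last two partitions, SALES_Q4_2019 and SALES_Q1_2020, are not enabled for the In-Memory column store (see USER_TAB_PARTITIONS.INMEMORY); the final statement then enables it for partition SALES_Q4_2019.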

CREATE TABLE sales1( prod_id NUMBER(6),time_id DATE,channel_id varchar2(100))

PARTITION BY RANGE (time_id)

(PARTITION SALES_Q1_2019

VALUES LESS THAN (TO_DATE('01-APR-2019','DD-MON-YYYY')) INMEMORY,

PARTITION SALES_Q2_2019

VALUES LESS THAN (TO_DATE('01-JUL-2019','DD-MON-YYYY')) INMEMORY,

PARTITION SALES_Q3_2019

VALUES LESS THAN (TO_DATE('01-OCT-2019','DD-MON-YYYY')) INMEMORY,

PARTITION SALES_Q4_2019

VALUES LESS THAN (TO_DATE('01-JAN-2020','DD-MON-YYYY')) NO INMEMORY,

PARTITION SALES_Q1_2020

VALUES LESS THAN (MAXVALUE));

alter table sales1 modify partition SALES_Q4_2019 inmemory;
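
The per-partition state can be inspected before and after the ALTER TABLE. A minimal sketch, assuming SALES1 was created in the current schema as shown above:

```sql
-- One row per partition; the INMEMORY column shows ENABLED or DISABLED.
SELECT partition_name,
       inmemory,
       inmemory_priority,
       inmemory_compression
FROM   user_tab_partitions
WHERE  table_name = 'SALES1'
ORDER  BY partition_position;
```

Running it before and after the MODIFY PARTITION statement should show only SALES_Q4_2019 changing.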

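5. Enabling INMEMORY for columns

In table1 as created below, only the CREATED_APPID column is not enabled for the In-Memory column store; all other columns are enabled.

So when only some columns of a table should be INMEMORY, the remaining columns must be listed explicitly with NO INMEMORY, because columns left unlisted are still enabled.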

create table table1 as select * from dba_objects;

alter table table1 inmemory (OWNER) no inmemory (CREATED_APPID);
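
The column-level result can be read from V$IM_COLUMN_LEVEL, the view the summary notes point to. A minimal sketch, assuming TABLE1 is the table created above and the session can query the V$ views:

```sql
-- One row per column; per the experiment above, CREATED_APPID is the only
-- column expected to show NO INMEMORY.
SELECT table_name,
       column_name,
       inmemory_compression
FROM   v$im_column_level
WHERE  table_name = 'TABLE1'
ORDER  BY segment_column_id;
```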

When a database is restarted, all of the data for database objects with a priority level other than NONE are populated in the IM column store during startup.
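
A minimal sketch of giving a table a priority so that it is populated at startup rather than on first full scan. TABLE1 is assumed to exist in the current schema, and HIGH is just one of the available levels (LOW, MEDIUM, HIGH, CRITICAL).

```sql
-- Any priority other than NONE makes the object eligible for population at startup.
ALTER TABLE table1 INMEMORY PRIORITY HIGH;

-- Confirm the priority recorded in the dictionary.
SELECT table_name, inmemory, inmemory_priority
FROM   user_tables
WHERE  table_name = 'TABLE1';

-- After the next restart, check that the segment was populated without any query touching it.
SELECT segment_name, inmemory_priority, populate_status
FROM   v$im_segments
WHERE  segment_name = 'TABLE1';
```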


Population

The operation of reading existing data blocks from data files, transforming the rows into columnar format, and then writing the columnar data to the IM column store. In contrast, loading refers to bringing new data into the database using DML or DDL.

Population, which transforms existing data on disk into columnar format, is different from repopulation, which loads new data into the IM column store. Because IMCUs are read-only structures, Oracle Database does not populate them when rows change. Rather, the database records the row changes in a transaction journal, and then creates new IMCUs as part of repopulation.


IMCU

An In-Memory Compression Unit (IMCU) is a compressed, read-only storage unit that contains data for one or more columns.


Transaction journal

Metadata in a Snapshot Metadata Unit (SMU) that keeps the IM column store transactionally consistent.


Every SMU contains a transaction journal. The database uses the transaction journal to keep the IMCU transactionally consistent.

The database uses the buffer cache to process DML, just as when the IM column store is not enabled. For example, an UPDATE statement might modify a row in an IMCU. In this case, the database adds the rowid for the modified row to the transaction journal and marks it stale as of the SCN of the DML statement. If a query needs to access the new version of the row, then the database obtains the row from the database buffer cache.

The database achieves read consistency by merging the contents of the column, transaction journal, and buffer cache. When the IMCU is refreshed during repopulation, queries can access the up-to-date row directly from the IMCU.


Repopulation

The automatic refresh of a currently populated In-Memory Compression Unit (IMCU) after its data has been significantly modified. In contrast, population is the initial creation of IMCUs in the IM column store.
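
Repopulation normally runs in the background, but it can also be requested by hand. The sketch below assumes a populated table TABLE1 in the current schema and that DBMS_INMEMORY.REPOPULATE accepts a force argument as in the 12.2 package; check the signature on your release before relying on it.

```sql
-- Ask for a repopulation of the table; force => TRUE requests that all of its
-- IMCUs be rebuilt, not only those with changed rows.
EXEC DBMS_INMEMORY.REPOPULATE(USER, 'TABLE1', force => TRUE)

-- Watch the segment while the background work runs.
SELECT segment_name, populate_status, bytes_not_populated
FROM   v$im_segments
WHERE  segment_name = 'TABLE1';

-- Parameters that bound how many background servers (re)population may use.
SHOW PARAMETER inmemory_max_populate_servers
SHOW PARAMETER inmemory_trickle_repopulate_servers_percent
```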



Article link: https://www.lsjlt.com/news/47406.html (please cite this source link when reposting)
