
HBase Filter: DependentColumnFilter Explained

Published: 2022-03-14 05:03:51

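Preface: This article walks through the Java and shell APIs of HBase's DependentColumnFilter and includes sample code for reference. DependentColumnFilter, also called the reference-column filter, lets the user name a reference (dependent) column against which the other columns are filtered; the selection is based on the timestamp of the reference column.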

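The filter looks for the reference column in every row and returns all key-values of that row that share its timestamp; if a row does not contain the reference column, nothing is returned for that row. The dropDependentColumn parameter decides whether the reference column itself is returned or dropped: true drops it, false (the default) returns it, which matches the outputs of the examples in this article. You can think of DependentColumnFilter as a combination of a ValueFilter and a timestamp filter. Consider it when you want to fetch data from the same timeline. For comparator details and internals, see the earlier post: HBase Filter: Comparator Principles and Source Code Study.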
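To make the timestamp rule concrete, here is a minimal in-memory sketch in plain Java. It is not HBase's implementation: `SimpleCell`, `filterRow`, `sampleRow1` and `sampleRow3` are hypothetical names introduced only for illustration, and the sketch assumes the filter works row by row with no value comparator. The cells and timestamps are the ones shown in the scan output of the test table in the Shell API section.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DependentColumnSketch {

    /** Hypothetical stand-in for an HBase cell (not org.apache.hadoop.hbase.Cell). */
    public static final class SimpleCell {
        final String row, family, qualifier, value;
        final long timestamp;

        public SimpleCell(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        boolean isColumn(String f, String q) {
            return family.equals(f) && qualifier.equals(q);
        }

        @Override
        public String toString() {
            return row + ":" + family + ":" + qualifier + ":" + value;
        }
    }

    /** Applies the reference-column rule to the cells of ONE row. */
    public static List<SimpleCell> filterRow(List<SimpleCell> rowCells, String family,
                                             String qualifier, boolean dropDependentColumn) {
        // Step 1: collect the timestamps of the reference column in this row.
        Set<Long> referenceStamps = new HashSet<>();
        for (SimpleCell cell : rowCells) {
            if (cell.isColumn(family, qualifier)) {
                referenceStamps.add(cell.timestamp);
            }
        }
        // Step 2: keep cells with a matching timestamp; optionally drop the reference column.
        List<SimpleCell> kept = new ArrayList<>();
        for (SimpleCell cell : rowCells) {
            if (!referenceStamps.contains(cell.timestamp)) {
                continue;
            }
            if (dropDependentColumn && cell.isColumn(family, qualifier)) {
                continue;
            }
            kept.add(cell);
        }
        return kept; // empty when the row has no reference column
    }

    /** row-1 of the article's test table. */
    public static List<SimpleCell> sampleRow1() {
        return Arrays.asList(
                new SimpleCell("row-1", "f1", "c1", 1589794115268L, "abcdefg"),
                new SimpleCell("row-1", "f2", "c2", 1589794115268L, "abc"),
                new SimpleCell("row-1", "f2", "c3", 1589794115241L, "1234abc56"));
    }

    /** row-3 of the article's test table; it has no f1:c1 column. */
    public static List<SimpleCell> sampleRow3() {
        return Arrays.asList(
                new SimpleCell("row-3", "f1", "c3", 1589794115241L, "1234321"));
    }

    public static void main(String[] args) {
        System.out.println(filterRow(sampleRow1(), "f1", "c1", false)); // [row-1:f1:c1:abcdefg, row-1:f2:c2:abc]
        System.out.println(filterRow(sampleRow1(), "f1", "c1", true));  // [row-1:f2:c2:abc]
        System.out.println(filterRow(sampleRow3(), "f1", "c1", false)); // []
    }
}
```

In this model, row-1's f2:c3 cell is dropped because it was written in the earlier batch and carries a different timestamp, and row-3 yields nothing because it has no f1:c1 column.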

I. Java API

Head code

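import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedList;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.util.Bytes;

// MyBase is the author's own helper class from the demo repository linked at the end of the article.
public class DependentColumnFilterDemo {

    // Set to true to (re)create the table and load the test data.
    private static boolean isok = false;
    private static String tableName = "test";
    private static String[] cfs = new String[]{"f1", "f2"};
    // Each entry has the form row:family:qualifier:value.
    private static String[] data1 = new String[]{"row-1:f2:c3:1234abc56", "row-3:f1:c3:1234321"};
    private static String[] data2 = new String[]{
            "row-1:f1:c1:abcdefg", "row-1:f2:c2:abc", "row-2:f1:c1:abc123456", "row-2:f2:c2:1234abc567"
    };

    public static void main(String[] args) throws IOException, InterruptedException {

        MyBase myBase = new MyBase();
        Connection connection = myBase.createConnection();
        if (isok) {
            myBase.deleteTable(connection, tableName);
            myBase.createTable(connection, tableName, cfs);
            // Load the test data
            myBase.putRows(connection, tableName, data1);  // first batch
            Thread.sleep(10);  // so that the two batches get different timestamps
            myBase.putRows(connection, tableName, data2);  // second batch
        }
        Table table = connection.getTable(TableName.valueOf(tableName));
        Scan scan = new Scan();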

Middle code
The output of each variant is shown in the trailing comment of its statement.

        // The declarations below are alternatives: keep exactly one and comment out the rest,
        // since they all declare the same variable `filter`.

        // Constructor 1
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"));  // [row-1:f1:c1:abcdefg, row-1:f2:c2:abc, row-2:f1:c1:abc123456, row-2:f2:c2:1234abc567]

        // Constructor 2, dropDependentColumn=true
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), true);  // [row-1:f2:c2:abc, row-2:f2:c2:1234abc567]

        // Constructor 2, dropDependentColumn=false (false is the default)
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false); // [row-1:f1:c1:abcdefg, row-1:f2:c2:abc, row-2:f1:c1:abc123456, row-2:f2:c2:1234abc567]

        // Constructor 3 + BinaryComparator
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,
                CompareFilter.CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("abcdefg"))); // [row-1:f1:c1:abcdefg, row-1:f2:c2:abc]

        // Constructor 3 + BinaryPrefixComparator
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,
                CompareFilter.CompareOp.EQUAL, new BinaryPrefixComparator(Bytes.toBytes("abc")));  // [row-1:f1:c1:abcdefg, row-1:f2:c2:abc, row-2:f1:c1:abc123456, row-2:f2:c2:1234abc567]

        // Constructor 3 + SubstringComparator
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,
                CompareFilter.CompareOp.EQUAL, new SubstringComparator("1234"));  // [row-2:f1:c1:abc123456, row-2:f2:c2:1234abc567]

        // Constructor 3 + RegexStringComparator
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,
                CompareFilter.CompareOp.EQUAL, new RegexStringComparator("[a-z]"));  // [row-1:f1:c1:abcdefg, row-1:f2:c2:abc, row-2:f1:c1:abc123456, row-2:f2:c2:1234abc567]

        // Constructor 3 + RegexStringComparator
        DependentColumnFilter filter = new DependentColumnFilter(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,
                CompareFilter.CompareOp.EQUAL, new RegexStringComparator("1234[a-z]"));  // []  Exercise: compare with the previous example; why is the result empty?

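This filter also supports the different comparison syntaxes of each comparator, exactly like the filters covered in earlier posts, so they are not repeated one by one here.

Tail code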

        scan.setFilter(filter);
        ResultScanner scanner = table.getScanner(scan);
        Iterator<Result> iterator = scanner.iterator();
        LinkedList<String> keys = new LinkedList<>();
        while (iterator.hasNext()) {
            String key = "";
            Result result = iterator.next();
            for (Cell cell : result.rawCells()) {
                byte[] rowkey = CellUtil.cloneRow(cell);
                byte[] family = CellUtil.cloneFamily(cell);
                byte[] column = CellUtil.cloneQualifier(cell);
                byte[] value = CellUtil.cloneValue(cell);
                key = Bytes.toString(rowkey) + ":" + Bytes.toString(family) + ":" + Bytes.toString(column) + ":" + Bytes.toString(value);
                keys.add(key);
            }
        }
        System.out.println(keys);
        scanner.close();
        table.close();
        connection.close();
    }
}

II. Shell API

Data in the HBase test table:

hbase(main):009:0> scan "test"
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-1                                           column=f2:c3, timestamp=1589794115241, value=1234abc56
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
 row-3                                           column=f1:c3, timestamp=1589794115241, value=1234321
3 row(s) in 0.0280 seconds

0. Simple constructors

hbase(main):006:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0450 seconds

hbase(main):008:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',false)"}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0310 seconds

hbase(main):007:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',true)"}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0250 seconds

1. Building the filter with BinaryComparator

Method 1:

hbase(main):004:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',false,=,'binary:abcdefg')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
1 row(s) in 0.0330 seconds

hbase(main):005:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',true,=,'binary:abcdefg')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
1 row(s) in 0.0120 seconds

Supported comparison operators: = != > >= < <= (not demonstrated one by one).

Method 2:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.BinaryComparator
import org.apache.hadoop.hbase.filter.DependentColumnFilter

hbase(main):016:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,CompareFilter::CompareOp.valueOf("EQUAL"), BinaryComparator.new(Bytes.toBytes("abcdefg")))}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
1 row(s) in 0.0170 seconds

hbase(main):017:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), true,CompareFilter::CompareOp.valueOf("EQUAL"), BinaryComparator.new(Bytes.toBytes("abcdefg")))}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
1 row(s) in 0.0140 seconds

Supported comparison operators: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL (not demonstrated one by one).

Method 1 is recommended; it is more concise and convenient.
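The two spellings of the comparison operators (symbols in the filter string, enum names in the Java-style call) can be related with a small lookup table. The following is a rough sketch in plain Java under stated assumptions: `Op`, `SYMBOLS` and `matches` are hypothetical local names, not HBase classes, and the ordering is modeled with `String.compareTo`, which agrees with byte-wise lexicographic order only for ASCII values like the ones in the test table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompareOpSketch {

    /** Local enum mirroring the operator names listed in this section (not HBase's own class). */
    public enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER }

    /** Filter-string symbol -> operator name. */
    public static final Map<String, Op> SYMBOLS = new LinkedHashMap<>();
    static {
        SYMBOLS.put("<", Op.LESS);
        SYMBOLS.put("<=", Op.LESS_OR_EQUAL);
        SYMBOLS.put("=", Op.EQUAL);
        SYMBOLS.put("!=", Op.NOT_EQUAL);
        SYMBOLS.put(">=", Op.GREATER_OR_EQUAL);
        SYMBOLS.put(">", Op.GREATER);
    }

    /** True when "cellValue SYMBOL operand" holds under lexicographic ordering. */
    public static boolean matches(String cellValue, String symbol, String operand) {
        Op op = SYMBOLS.get(symbol);
        if (op == null) {
            throw new IllegalArgumentException("unknown operator: " + symbol);
        }
        int cmp = cellValue.compareTo(operand);
        switch (op) {
            case LESS:             return cmp < 0;
            case LESS_OR_EQUAL:    return cmp <= 0;
            case EQUAL:            return cmp == 0;
            case NOT_EQUAL:        return cmp != 0;
            case GREATER_OR_EQUAL: return cmp >= 0;
            case GREATER:          return cmp > 0;
            default:               throw new AssertionError(op);
        }
    }

    public static void main(String[] args) {
        // The reference column f1:c1 holds "abcdefg" in row-1 and "abc123456" in row-2.
        System.out.println(matches("abcdefg", "=", "abcdefg"));   // true
        System.out.println(matches("abc123456", "=", "abcdefg")); // false
        System.out.println(matches("abc123456", "<", "abcdefg")); // true ('1' sorts before 'd')
    }
}
```

Under this model only row-1 passes the `=` test against `abcdefg`, which is consistent with the single row returned by the scans in this section.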

2. Building the filter with BinaryPrefixComparator

Method 1:

hbase(main):019:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',false,=,'binaryprefix:abc')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0330 seconds

hbase(main):020:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',true,=,'binaryprefix:abc')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0600 seconds

Method 2:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator
import org.apache.hadoop.hbase.filter.DependentColumnFilter

hbase(main):023:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,CompareFilter::CompareOp.valueOf("EQUAL"), BinaryPrefixComparator.new(Bytes.toBytes("abc")))}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0180 seconds

hbase(main):022:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), true,CompareFilter::CompareOp.valueOf("EQUAL"), BinaryPrefixComparator.new(Bytes.toBytes("abc")))}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0190 seconds

Everything else is the same as above.

3. Building the filter with SubstringComparator

Method 1:

hbase(main):025:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',false,=,'substring:abc')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0340 seconds

hbase(main):024:0> scan "test",{FILTER=>"DependentColumnFilter('f1','c1',true,=,'substring:abc')"}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0160 seconds

Method 2:

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.DependentColumnFilter

hbase(main):028:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,CompareFilter::CompareOp.valueOf("EQUAL"), SubstringComparator.new("abc"))}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0150 seconds

hbase(main):029:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), true,CompareFilter::CompareOp.valueOf("EQUAL"), SubstringComparator.new("abc"))}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0170 seconds

Unlike the comparators above, this one takes a string directly, and it supports only the EQUAL and NOT_EQUAL operators.
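The three comparators covered so far differ only in how the reference column's value is tested with EQUAL. The sketch below models that difference in plain Java; the method names are hypothetical, strings stand in for byte arrays, and the case-insensitive substring match reflects my understanding of SubstringComparator, so treat it as an assumption to check against your HBase version.

```java
import java.util.Locale;

public class ComparatorSketch {

    /** Models BinaryComparator with EQUAL: the whole value must match. */
    public static boolean binaryEquals(String value, String operand) {
        return value.equals(operand);
    }

    /** Models BinaryPrefixComparator with EQUAL: only the leading part is compared. */
    public static boolean prefixEquals(String value, String prefix) {
        return value.startsWith(prefix);
    }

    /** Models SubstringComparator with EQUAL: containment, ignoring case (assumption). */
    public static boolean substringEquals(String value, String substr) {
        return value.toLowerCase(Locale.ROOT).contains(substr.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Values of the reference column f1:c1 in row-1 and row-2 of the test table.
        String row1 = "abcdefg";
        String row2 = "abc123456";

        System.out.println(binaryEquals(row1, "abcdefg"));  // true
        System.out.println(binaryEquals(row2, "abcdefg"));  // false
        System.out.println(prefixEquals(row1, "abc"));      // true
        System.out.println(prefixEquals(row2, "abc"));      // true
        System.out.println(substringEquals(row1, "1234"));  // false
        System.out.println(substringEquals(row2, "1234"));  // true
    }
}
```

These results line up with the Java example outputs earlier in the article: the binary comparison keeps only row-1, the prefix comparison keeps both rows, and the substring "1234" keeps only row-2.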

4. Building the filter with RegexStringComparator

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.RegexStringComparator
import org.apache.hadoop.hbase.filter.DependentColumnFilter

hbase(main):035:0> scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), false,CompareFilter::CompareOp.valueOf("EQUAL"), RegexStringComparator.new("[a-z]"))}
ROW                                              COLUMN+CELL
 row-1                                           column=f1:c1, timestamp=1589794115268, value=abcdefg
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f1:c1, timestamp=1589794115268, value=abc123456
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0170 seconds

hbase(main):034:0* scan "test",{FILTER => DependentColumnFilter.new(Bytes.toBytes("f1"), Bytes.toBytes("c1"), true,CompareFilter::CompareOp.valueOf("EQUAL"), RegexStringComparator.new("[a-z]"))}
ROW                                              COLUMN+CELL
 row-1                                           column=f2:c2, timestamp=1589794115268, value=abc
 row-2                                           column=f2:c2, timestamp=1589794115268, value=1234abc567
2 row(s) in 0.0150 seconds

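This comparator also takes a string directly and supports only the EQUAL and NOT_EQUAL operators. If you want to use method 1, try passing regexstring; my version is too old to support it, so it is not demonstrated here.

Note that regex matching here means containment: it corresponds to the underlying find() method.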
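The difference between containment (find()) and a full match (matches()) can be checked with java.util.regex alone. This is only an illustration of the matching semantics, not HBase code; `found` and `fullMatch` are hypothetical helper names. It also suggests why the pattern "1234[a-z]" returns nothing in the Java example: the comparator is applied to the value of the reference column f1:c1, not to the other columns.

```java
import java.util.regex.Pattern;

public class RegexFindSketch {

    /** Containment semantics: true if the pattern occurs anywhere in the value. */
    public static boolean found(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    /** Full-match semantics, shown only for contrast. */
    public static boolean fullMatch(String regex, String value) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    public static void main(String[] args) {
        // Values of the reference column f1:c1 in row-1 and row-2.
        System.out.println(found("[a-z]", "abcdefg"));        // true
        System.out.println(found("[a-z]", "abc123456"));      // true
        System.out.println(fullMatch("[a-z]", "abcdefg"));    // false: one letter is not the whole value

        // "1234" must be followed by a lowercase letter; in "abc123456" it is followed by "5".
        System.out.println(found("1234[a-z]", "abcdefg"));    // false
        System.out.println(found("1234[a-z]", "abc123456"));  // false

        // The f2:c2 value of row-2 would match, but it is not the reference column.
        System.out.println(found("1234[a-z]", "1234abc567")); // true
    }
}
```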

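DependentColumnFilter does not support the LongComparator, and the BitComparator and NullComparator are rarely used, so they are not covered here.

That wraps up all of the comparison filters.

The full source code for this article is available on GitHub: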

https://github.com/zhoupengbo/demos-bigdata/blob/master/hbase/hbase-filters-demos/src/main/java/com/zpb/demos/DependentColumnFilterDemo.java


Please credit the source when reposting! You are welcome to follow my WeChat official account 【HBase工作笔记】.

Article link: https://www.lsjlt.com/news/6175.html (please include the source link when reposting)
