Cao Yunhao's Blog

HBase Meta Warm-up

  1. Caching via requests
  2. Caching via reflection

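On the first request for a row, HBase queries the Meta table to find the Region that holds that row. If an HBaseClient starts taking heavy traffic immediately after it is created, all of those first requests hit the Meta table at once and a large number of them fail. The meta therefore needs to be cached ahead of time.

HBaseClient does not directly expose an operation for caching meta, but there are two ways to warm it up: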

Caching via requests

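  1. Scan the Meta table by table name. A Meta row key has the format {table_name},{start_row},{region_id}, so we use "{table_name},," as the first row of the scan and "{table_name} ,," (note the space after the table name) as the last row. That last row does not actually exist; everything between the two is the full set of Regions for the table.
  2. Read the row key of each result and take the Region start key from it.
  3. Send Get requests for those start keys to the target table. HBaseClient caches the Meta of each Region the first time it is requested.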
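import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// `log` is assumed to be a logger field of the enclosing class (HBase 1.x client API).
public void warmup(Connection connection, TableName tableName) throws IOException {
    List<Get> getList = new ArrayList<>();
    try (Table table = connection.getTable(TableName.META_TABLE_NAME)) {
        Scan scan = new Scan();

        // Meta row keys look like {table_name},{start_row},{region_id}
        String startRow = tableName.getNameAsString() + ",,";
        String stopRow = tableName.getNameAsString() + " ,,";

        scan.setStartRow(Bytes.toBytes(startRow));
        scan.setStopRow(Bytes.toBytes(stopRow));
        // Only the row key is needed, so fetch a single small column
        scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("serverstartcode"));
        scan.setCaching(1000);
        // Scan the Meta table and collect the row keys (capped at 100000 Regions)
        try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result result = scanner.next();
                 result != null && getList.size() < 100000;
                 result = scanner.next()) {

                byte[] row = result.getRow();
                if (row == null || row.length == 0) {
                    continue;
                }
                // Parse the region name as bytes, so start keys that contain
                // commas or non-text bytes are kept intact
                byte[] startKey;
                try {
                    startKey = HRegionInfo.parseRegionName(row)[1];
                } catch (IOException e) {
                    continue; // not a valid region row, skip it
                }
                if (startKey.length == 0) {
                    // The first Region has an empty start key and a Get cannot
                    // use an empty row, so use "!" as a placeholder row
                    startKey = Bytes.toBytes("!");
                }
                getList.add(new Get(startKey));
            }
        }
    }

    if (!getList.isEmpty()) {
        // Request the start key of every Region
        try (Table table = connection.getTable(tableName)) {
            table.get(getList);
        }
    }
    log.info("Warmed meta by " + getList.size() + " GET(s) for the table " + tableName.getNameAsString());
}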
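
The row key layout and the scan bounds used above can be checked without a cluster. The following is a minimal sketch under stated assumptions: MetaRowKeySketch and its methods are hypothetical names, not HBase APIs, and compareMetaStyle is only a simplified stand-in for the table-name-first ordering that hbase:meta is understood to use. That ordering is why "{table_name} ,," can serve as a stop row even though a space sorts before a comma in plain byte order.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the HBase client.
class MetaRowKeySketch {

    /** Splits "{table_name},{start_row},{region_id}" on the first and last comma,
     *  so a start row that itself contains commas is kept in one piece. */
    static String[] splitRowKey(String rowKey) {
        int first = rowKey.indexOf(',');
        int last = rowKey.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("not a meta row key: " + rowKey);
        }
        return new String[] {
            rowKey.substring(0, first),
            rowKey.substring(first + 1, last),
            rowKey.substring(last + 1)
        };
    }

    /** Plain unsigned byte-wise comparison; a shorter prefix sorts first. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Simplified assumption: compare the table name first, then the remainder. */
    static int compareMetaStyle(String left, String right) {
        int l = left.indexOf(',');
        int r = right.indexOf(',');
        String leftTable = l < 0 ? left : left.substring(0, l);
        String rightTable = r < 0 ? right : right.substring(0, r);
        int c = compareBytes(leftTable.getBytes(StandardCharsets.UTF_8),
                             rightTable.getBytes(StandardCharsets.UTF_8));
        if (c != 0) {
            return c;
        }
        String leftRest = l < 0 ? "" : left.substring(l + 1);
        String rightRest = r < 0 ? "" : right.substring(r + 1);
        return compareBytes(leftRest.getBytes(StandardCharsets.UTF_8),
                            rightRest.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String start = "t1,,";
        String stop = "t1 ,,";
        // In plain byte order the stop row sorts before the start row
        System.out.println(compareBytes(stop.getBytes(StandardCharsets.UTF_8),
                                        start.getBytes(StandardCharsets.UTF_8)) < 0);
        // With the table name compared first, the stop row sorts after every "t1" row
        System.out.println(compareMetaStyle("t1,row,with,commas,123", stop) < 0);
        System.out.println(splitRowKey("t1,row,with,commas,123")[1]);
    }
}
```

In this sketch the stop row also sorts before any row of a table whose name merely starts with "t1" (for example "t1a"), so the scan stays within the one table.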

Caching via reflection

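Reflection needs a concrete class, but Connection is an interface, so the first step is to read the source and find the implementing class. At the moment the main implementation is ConnectionManager.HConnectionImplementation, which contains a cacheLocation method.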

  1. Use MetaScanner.listTableRegionLocations to query the Meta table and obtain the RegionLocations.
  2. Call the built-in cacheLocation method to cache them directly.
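import java.lang.reflect.Method;
import java.util.List;

import org.apache.hadoop.hbase.RegionLocations;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.MetaScanner;

// `log` is assumed to be a static logger field of the enclosing class.
public static void hbaseWarmUpTableMeta(Connection connection, TableName tableName) throws Exception {
    log.info("Warming up meta for table " + tableName.getNameAsString());

    // Look up cacheLocation on the runtime class of the connection
    Method cacheLocationMethod = connection.getClass().getDeclaredMethod("cacheLocation", TableName.class, RegionLocations.class);
    cacheLocationMethod.setAccessible(true);

    int size = 0;
    // One request to the Meta table returns the locations of all Regions of the table
    List<RegionLocations> regionLocationsList = MetaScanner.listTableRegionLocations(connection.getConfiguration(), connection, tableName);
    for (RegionLocations regionLocations : regionLocationsList) {
        size += regionLocations.size();
        cacheLocationMethod.invoke(connection, tableName, regionLocations);
    }

    log.info("Warmed " + size + " region location(s) for table " + tableName.getNameAsString());
}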
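
One caveat of getDeclaredMethod is that it only inspects the exact runtime class, so the lookup throws NoSuchMethodException if the connection object is a subclass of the class that declares cacheLocation. A minimal sketch of a lookup that walks up the superclass chain follows. ReflectiveLookupSketch, BaseConnection, and WrappedConnection are hypothetical stand-ins for illustration, not HBase classes.

```java
import java.lang.reflect.Method;

// Hypothetical helper for illustration only; not part of the HBase client.
class ReflectiveLookupSketch {

    /** Finds a declared method on the given class or the nearest superclass that has it. */
    static Method findDeclaredMethod(Class<?> type, String name, Class<?>... parameterTypes)
            throws NoSuchMethodException {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Method m = c.getDeclaredMethod(name, parameterTypes);
                m.setAccessible(true);
                return m;
            } catch (NoSuchMethodException ignored) {
                // Not declared on this class; try its superclass
            }
        }
        throw new NoSuchMethodException(type.getName() + "." + name);
    }

    // Stand-in for a class that declares the method we want to call
    static class BaseConnection {
        private String cacheLocation(String table) {
            return "cached:" + table;
        }
    }

    // Stand-in for a subclass that does not redeclare the method
    static class WrappedConnection extends BaseConnection {
    }

    public static void main(String[] args) throws Exception {
        Object connection = new WrappedConnection();
        Method m = findDeclaredMethod(connection.getClass(), "cacheLocation", String.class);
        System.out.println(m.invoke(connection, "t1"));
    }
}
```

Calling getDeclaredMethod directly on WrappedConnection.class would fail here, while the helper finds the method on BaseConnection.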

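In a comparison run in production, caching via reflection (50~200ms) was much faster than caching via requests (500~1000ms). A likely reason is that the request approach needs two rounds of requests, and a large batch of Gets is slow.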
