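On its first request, an HBase client queries the Meta table to find the Region that holds the requested row. If a freshly created HBaseClient starts taking heavy traffic right away, many of those first requests fail, so the meta information needs to be cached ahead of time.
HBaseClient does not directly expose an operation for caching meta. There are two ways to warm it up: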
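1. Request the first row of every Region
2. Use reflection to reach the caching class and method inside HBaseClient

Warming up through requests

1. Scan the Meta table by table name. Meta row keys have the form `{table_name},{start_row},{region_id}`, so we use `{table_name},,` as the first row of the scan and `{table_name} ,,` (note the space after the table name) as the last row. That last row does not actually exist; everything between the two is the full set of Regions for the table.
2. Read each row of the result and extract the Region start key from it.
3. Issue Get requests for those rows. HBaseClient caches each Region's meta when it is first requested.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// `log` is assumed to be a logger field of the enclosing class.
public void warmup(Connection connection, TableName tableName) throws IOException {
    List<Get> getList = new ArrayList<>();
    // Step 1: scan the Meta table for every Region of the given table.
    try (Table table = connection.getTable(TableName.META_TABLE_NAME)) {
        Scan scan = new Scan();
        String startRow = tableName.getNameAsString() + ",,";
        String stopRow = tableName.getNameAsString() + " ,,";
        scan.setStartRow(Bytes.toBytes(startRow));
        scan.setStopRow(Bytes.toBytes(stopRow));
        scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("serverstartcode"));
        scan.setCaching(1000);
        try (ResultScanner scanner = table.getScanner(scan)) {
            // Cap the warm-up at 100000 Gets.
            for (Result result = scanner.next();
                 result != null && getList.size() < 100000;
                 result = scanner.next()) {
                byte[] row = result.getRow();
                if (row == null || row.length == 0) {
                    continue;
                }
                // Step 2: the second comma-separated field is the Region start key.
                String[] rowArr = Bytes.toString(row).split(",");
                if (rowArr.length < 2) {
                    continue;
                }
                String startKey = rowArr[1];
                if (startKey.length() == 0) {
                    // The first Region has an empty start key, and a Get needs a
                    // non-empty row, so use a small placeholder key instead.
                    startKey = "!";
                }
                getList.add(new Get(Bytes.toBytes(startKey)));
            }
        }
    }
    // Step 3: one batched Get makes the client look up and cache every Region.
    if (!getList.isEmpty()) {
        try (Table table = connection.getTable(tableName)) {
            table.get(getList);
        }
    }
    log.info("Warmed meta by " + getList.size() + " GET(s) for the table "
            + tableName.getNameAsString());
}
```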
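One caveat with splitting the meta row key on every comma: a start key that itself contains a comma would be cut short. Below is a minimal sketch of a more tolerant parse. It assumes that neither the table name nor the region id contains a comma, so the start key is everything between the first and the last comma. `MetaRowKeys` and `startKey` are hypothetical names used for illustration, not HBase APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper (not part of HBase): pulls the start key out of a meta row key. */
final class MetaRowKeys {
    private MetaRowKeys() {}

    /**
     * Returns the bytes between the first and the last comma of
     * "{table_name},{start_row},{region_id}", or null if the row has
     * fewer than two commas. Assumes table name and region id hold no comma.
     */
    static byte[] startKey(byte[] metaRow) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < metaRow.length; i++) {
            if (metaRow[i] == ',') {
                if (first < 0) {
                    first = i;
                }
                last = i;
            }
        }
        if (first < 0 || first == last) {
            return null; // malformed: zero or one delimiter
        }
        return Arrays.copyOfRange(metaRow, first + 1, last);
    }

    public static void main(String[] args) {
        // The start key "user,42" contains a comma of its own.
        byte[] row = "orders,user,42,1543968000000.abc.".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(startKey(row), StandardCharsets.UTF_8));
    }
}
```

Working on bytes rather than a `String` also avoids charset surprises when start keys are binary. An empty result (the first Region) still needs the placeholder treatment before it is turned into a Get.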
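Warming up through reflection

Reflection needs a concrete class, but `Connection` is an interface, so the first step is to read the source and find the implementing class. The main implementation at present is `ConnectionManager.HConnectionImplementation`, which contains the method `cacheLocation`.

1. Use `MetaScanner.listTableRegionLocations` to query the Meta table and obtain the `RegionLocations`.
2. Call the built-in `cacheLocation` method to cache them directly.

```java
import java.lang.reflect.Method;
import java.util.List;

import org.apache.hadoop.hbase.RegionLocations;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.MetaScanner;

// `log` is assumed to be a static logger field of the enclosing class.
public static void hbaseWarmUpTableMeta(Connection connection, TableName tableName)
        throws Exception {
    log.info("Warming up meta for table " + tableName.getNameAsString());
    // Look up cacheLocation on the concrete connection class.
    Method cacheLocationMethod = connection.getClass()
            .getDeclaredMethod("cacheLocation", TableName.class, RegionLocations.class);
    cacheLocationMethod.setAccessible(true);
    int size = 0;
    // Step 1: read all Region locations of the table from Meta.
    List<RegionLocations> regionLocationsList = MetaScanner.listTableRegionLocations(
            connection.getConfiguration(), connection, tableName);
    // Step 2: put each location straight into the client-side cache.
    for (RegionLocations regionLocations : regionLocationsList) {
        size += regionLocations.size();
        cacheLocationMethod.invoke(connection, tableName, regionLocations);
    }
    log.info("Warmed " + size + " region location(s) for table "
            + tableName.getNameAsString());
}
```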
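`getDeclaredMethod` only inspects the exact runtime class, so the lookup fails with `NoSuchMethodException` if the connection object happens to be a subclass of the class that declares `cacheLocation`. The following is a sketch of a lookup that walks up the superclass chain. `ReflectionLookup`, `Base`, and `Derived` are hypothetical names; the two nested classes only stand in for a connection implementation and its subclass.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: finds a method declared on a class or any of its superclasses. */
final class ReflectionLookup {
    private ReflectionLookup() {}

    static Method findDeclaredMethod(Class<?> type, String name, Class<?>... parameterTypes)
            throws NoSuchMethodException {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Method m = c.getDeclaredMethod(name, parameterTypes);
                m.setAccessible(true); // allow calls to non-public methods
                return m;
            } catch (NoSuchMethodException ignored) {
                // Not declared on this class; try its superclass next.
            }
        }
        throw new NoSuchMethodException(type.getName() + "." + name);
    }

    // Stand-ins for a connection implementation and a subclass of it.
    static class Base {
        private String cacheLocation(String table) {
            return "cached " + table;
        }
    }

    static class Derived extends Base {}

    public static void main(String[] args) throws Exception {
        // Derived itself declares nothing; the method is found on Base.
        Method m = findDeclaredMethod(Derived.class, "cacheLocation", String.class);
        System.out.println(m.invoke(new Derived(), "t1"));
    }
}
```

In the warm-up method, this would replace the direct `connection.getClass().getDeclaredMethod(...)` call. Since `cacheLocation` is an internal method, the lookup may still break when the HBase client version changes.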
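In a comparison in production, the reflection approach (50~200 ms) was much faster than the request approach (500~1000 ms). A likely reason is that the request approach needs two requests (the Meta scan, then the Gets), and a very large batch of Gets is slow.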
2018-12-05