HBase - advanced configuration on column family level
Block Size
This is the HFile block size. The default is 64 KB. If you want good sequential scan performance, a larger block size is better.
The block size is set when the table is created:
hbase(main):002:0> create 'mytable', {NAME => 'colfam1', BLOCKSIZE => '65536'}
Or in code:
HColumnDescriptor has the method setBlocksize(int).
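As a rough illustration of the block size trade-off, the sketch below estimates how many blocks a store file would be split into for a few block sizes. It is plain Java arithmetic, not HBase API: the class name, the method name, and the 1 GB file size are assumptions made for the example, and real HFiles will differ because blocks are cut at cell boundaries and may be compressed.

```java
// Illustrative arithmetic only; not part of the HBase API.
class BlockSizeEstimate {

    /** Approximate number of data blocks needed for a file, rounding up. */
    static long estimateBlocks(long fileSizeBytes, int blockSizeBytes) {
        if (blockSizeBytes <= 0) {
            throw new IllegalArgumentException("block size must be positive");
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 1024L * 1024 * 1024; // hypothetical 1 GB store file
        int[] blockSizes = {16 * 1024, 64 * 1024, 256 * 1024};
        for (int blockSize : blockSizes) {
            System.out.println(blockSize + " byte blocks -> about "
                    + estimateBlocks(fileSize, blockSize) + " blocks");
        }
    }
}
```

Fewer, larger blocks mean a smaller block index and less per-block overhead during a sequential scan, but every random read has to load a larger block.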
Block Cache
You can disable the block cache for a specific column family, for example to leave more cache space for other column families.
hbase(main):002:0> create 'mytable',
{NAME => 'colfam1', BLOCKCACHE => 'false'}
Aggressive caching
You can give selected column families higher priority in the block cache.
hbase(main):002:0> create 'mytable',
{NAME => 'colfam1', IN_MEMORY => 'true'}
Bloom filters
You enable bloom filters on a column family like this:
hbase(main):007:0> create 'mytable',
{NAME => 'colfam1', BLOOMFILTER => 'ROWCOL'}
A row-level bloom filter is enabled with ROW, and a qualifier-level bloom filter (row key plus column qualifier) is enabled with ROWCOL.
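The difference between ROW and ROWCOL is easiest to see in what gets hashed into the filter. The following is a toy bloom filter in plain Java, written only to illustrate that idea. It is not HBase's implementation: the class, the hash functions, the filter size, and the sample row and qualifier names are all assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Toy Bloom filter for illustration; HBase's real implementation differs.
class ToyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    ToyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit position from two simple hashes (double hashing).
    private int position(String key, int i) {
        int h1 = 17;
        int h2 = 31;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h1 = h1 * 31 + b;
            h2 = h2 * 131 + b;
        }
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    // A ROW filter is keyed by the row key alone.
    static String rowKey(String row) {
        return row;
    }

    // A ROWCOL filter is keyed by the row key plus the column qualifier.
    static String rowColKey(String row, String qualifier) {
        return row + "\u0000" + qualifier;
    }

    public static void main(String[] args) {
        ToyBloomFilter rowFilter = new ToyBloomFilter(1024, 3);
        ToyBloomFilter rowColFilter = new ToyBloomFilter(1024, 3);

        // A store file holding one cell: row "user1", qualifier "email".
        rowFilter.add(rowKey("user1"));
        rowColFilter.add(rowColKey("user1", "email"));

        System.out.println(rowFilter.mightContain(rowKey("user1")));
        System.out.println(rowColFilter.mightContain(rowColKey("user1", "email")));
        // A different qualifier in the same row is usually reported as absent.
        System.out.println(rowColFilter.mightContain(rowColKey("user1", "phone")));
    }
}
```

A ROW filter can only say whether a store file might contain a given row, so it helps whole-row reads. A ROWCOL filter can also rule out a file for a specific column of that row, at the cost of more filter entries.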
TTL
Defining a Time To Live (TTL) on a column family causes its data to be deleted after the given amount of time, specified in seconds.
Example:
hbase(main):002:0> create 'mytable', {NAME => 'colfam1', TTL => '18000'}
Data in colfam1 that is older than 5 hours is deleted during the next major compaction.
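To make the arithmetic explicit: the TTL is given in seconds, while cell timestamps are milliseconds by default. The sketch below converts a retention period into a TTL value and checks whether a cell would count as expired. It is plain Java rather than HBase API, and the class and method names are hypothetical.

```java
import java.time.Duration;

// Illustrative arithmetic only; not part of the HBase API.
class TtlExample {

    /** Converts a retention period to the whole seconds the TTL attribute expects. */
    static long ttlSeconds(Duration retention) {
        return retention.getSeconds();
    }

    /** Returns true if a cell written at cellTimestampMs is older than the TTL at nowMs. */
    static boolean isExpired(long cellTimestampMs, long nowMs, long ttlSeconds) {
        return nowMs - cellTimestampMs > ttlSeconds * 1000L;
    }

    public static void main(String[] args) {
        long ttl = ttlSeconds(Duration.ofHours(5));
        System.out.println("TTL => '" + ttl + "'"); // prints TTL => '18000'

        long now = System.currentTimeMillis();
        long sixHoursAgo = now - Duration.ofHours(6).toMillis();
        System.out.println(isExpired(sixHoursAgo, now, ttl)); // prints true
    }
}
```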
Compression
The compression setting determines how HFile data is compressed on disk. Compression reduces disk I/O at the cost of higher CPU utilization.
hbase(main):002:0> create 'mytable',
{NAME => 'colfam1', COMPRESSION => 'SNAPPY'}
Cell versioning
By default, 3 versions of each value are kept (HBase 0.96 and later keep 1). This can be changed:
hbase(main):002:0> create 'mytable', {NAME => 'colfam1', VERSIONS => 5,
MIN_VERSIONS => '1'}
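MIN_VERSIONS matters mainly in combination with a TTL: it keeps a minimum number of versions even after they have expired, while VERSIONS caps how many are kept at all. The sketch below is a simplified model of that interaction in plain Java. It is not HBase's compaction code, and the class name, method name, and sample timestamps are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of version retention; not HBase's actual compaction code.
class VersionRetention {

    /**
     * Takes cell timestamps in milliseconds, sorted newest first, and returns
     * the ones that survive under this simplified model.
     */
    static List<Long> retained(List<Long> timestampsNewestFirst, int maxVersions,
                               int minVersions, long ttlSeconds, long nowMs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            if (kept.size() >= maxVersions) {
                break; // never keep more than VERSIONS
            }
            boolean expired = nowMs - ts > ttlSeconds * 1000L;
            if (!expired || kept.size() < minVersions) {
                kept.add(ts); // keep fresh cells, plus expired ones up to MIN_VERSIONS
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long now = 100_000_000L;
        long hour = 3_600_000L;

        // Two fresh versions and two older than a 5 hour TTL.
        List<Long> mixed = Arrays.asList(now - hour, now - 2 * hour, now - 6 * hour, now - 7 * hour);
        System.out.println(retained(mixed, 5, 1, 18000L, now).size()); // prints 2

        // Every version has expired, but MIN_VERSIONS => 1 keeps the newest one.
        List<Long> allExpired = Arrays.asList(now - 6 * hour, now - 7 * hour);
        System.out.println(retained(allExpired, 5, 1, 18000L, now).size()); // prints 1
    }
}
```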