scutil --set HostName "localhost"
scutil --get HostName
Note: make sure "localhost" is set in your /etc/hosts file
Friday, September 12, 2014
Thursday, September 11, 2014
HBase Snapshot Restrictions
HBase snapshots have two restrictions:
1. If regions are merged (usually manually) after the snapshot is taken, the snapshot becomes invalid. Region splits are fine and do not affect the snapshot.
2. If you restore data from a snapshot to create a new table, replication of that new table to another cluster does not guarantee data integrity.
Monday, August 25, 2014
Setup local dev env
Apply the patch for OS X
uname -a
Copy the IP and add it to /etc/hosts as an alias of localhost
Set up DNS servers in System Preferences -> Network -> Advanced -> DNS
Run Service in debug mode
gradle :apps:[Service_Name]:run --debug-jvm -Pdebug
Tuesday, July 15, 2014
Thursday, July 10, 2014
Find class in jar
In the lib directory, run:
grep package/or/classname *
Result:
abc.jar
Look through the classes within it:
less abc.jar
shift+g: go to the bottom
shift+n: go backward (previous search match)
space: go forward page by page
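The grep above expects the class in path form (package/or/classname). As a minimal sketch, assuming `unzip` is installed, the lookup can be wrapped in two small functions. The names `class_to_entry` and `find_class_in_jars` are hypothetical helpers, not part of any tool.

```shell
#!/bin/bash
# Hypothetical helper: convert a fully qualified class name
# (com.example.Foo) to its entry path inside a jar (com/example/Foo.class).
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: print every jar in a directory that lists the class.
# Assumes unzip is available; jars that cannot be read are skipped silently.
find_class_in_jars() {
  local dir="$1" entry jar
  entry="$(class_to_entry "$2")"
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    if unzip -l "$jar" 2>/dev/null | grep -qF -- "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}

# Usage: find_class_in_jars lib com.klout.thunder.common.UrlSummaryNormalizerTest
```

Matching on the full entry path avoids false hits on classes that only share a simple name.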
Tuesday, May 6, 2014
HBase - advanced configuration on column family level
Block Size
The HFile block size. The default is 64 KB. If you want good sequential scan performance, a larger block size is better; smaller blocks favor random reads.
It is set during table creation:
hbase(main):002:0> create 'mytable', {NAME => 'colfam1', BLOCKSIZE => '65536'}
Or in code:
HColumnDescriptor has a method for this: setBlocksize(int)
Block Cache
You can disable the block cache for a specific column family, for example to leave more cache available for other column families.
hbase(main):002:0> create 'mytable',
{NAME => 'colfam1', BLOCKCACHE => 'false'}
Aggressive caching
You can give selected column families higher priority in the block cache.
hbase(main):002:0> create 'mytable',
{NAME => 'colfam1', IN_MEMORY => 'true'}
Bloom filters
You enable bloom filters on the column family like this:
hbase(main):007:0> create 'mytable',
{NAME => 'colfam1', BLOOMFILTER => 'ROWCOL'}
A row-level bloom filter is enabled with ROW, and a qualifier-level bloom filter is enabled with ROWCOL
TTL
Defining a Time To Live (TTL, in seconds) on a column family deletes its data after the given amount of time.
Example:
hbase(main):002:0> create 'mytable', {NAME => 'colfam1', TTL => '18000'}
Data in colfam1 that is older than 5 hours is deleted during the next major compaction.
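The TTL value is given in seconds, so 18000 corresponds to 5 hours. A small sketch of the arithmetic, where `ttl_seconds` is a hypothetical helper name used only for illustration:

```shell
#!/bin/bash
# Hypothetical helper: convert a retention period in hours to a TTL in seconds.
ttl_seconds() {
  echo $(( $1 * 3600 ))
}

# Print the HBase shell statement for a 5-hour TTL (table and family names
# are the ones used in the example above).
printf "create 'mytable', {NAME => 'colfam1', TTL => '%s'}\n" "$(ttl_seconds 5)"
```

The printed line can be pasted into the HBase shell.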
Compression
The compression setting affects how HFiles store their data. It can reduce disk I/O at the cost of higher CPU utilization.
hbase(main):002:0> create 'mytable',
{NAME => 'colfam1', COMPRESSION => 'SNAPPY'}
Cell versioning
By default, 3 versions of each value are kept (HBase 0.96 and later default to 1). This can be changed:
hbase(main):002:0> create 'mytable', {NAME => 'colfam1', VERSIONS => 5,
MIN_VERSIONS => '1'}
Thanks to author Rami Mankevich
Wednesday, April 16, 2014
Ops
disk info: df
I/O info: iostat
memory usage info: free
CPU info: top / htop (is pretty cool)
network I/O: iftop
Thursday, February 13, 2014
Redis Note
1) Remove all keys with a prefix
for key in `echo 'KEYS sectionId_*' | redis-cli | awk '{print $1}'`
do echo DEL $key
done | redis-cli
2) Reset TTL on all keys (86400 seconds = 1 day)
for key in `echo 'KEYS *' | redis-cli | awk '{print $1}'`
do
echo expire $key 86400
done | redis-cli
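KEYS blocks the server while it walks the whole keyspace, so on a large instance an incremental scan may be preferable. The sketch below uses `redis-cli --scan --pattern` for that and factors the command generation into `keys_to_commands`, a hypothetical helper. It assumes key names contain no whitespace, as the loops above do.

```shell
#!/bin/bash
# Hypothetical helper: read key names on stdin (one per line) and print one
# Redis command per key. An optional second argument is appended after the key.
#   keys_to_commands DEL            -> DEL <key>
#   keys_to_commands EXPIRE 86400   -> EXPIRE <key> 86400
keys_to_commands() {
  local cmd="$1" arg="${2:-}" key
  while IFS= read -r key; do
    [ -n "$key" ] || continue       # skip blank lines
    if [ -n "$arg" ]; then
      printf '%s %s %s\n' "$cmd" "$key" "$arg"
    else
      printf '%s %s\n' "$cmd" "$key"
    fi
  done
}

# Usage (not executed here):
#   redis-cli --scan --pattern 'sectionId_*' | keys_to_commands DEL | redis-cli
#   redis-cli --scan --pattern '*' | keys_to_commands EXPIRE 86400 | redis-cli
```

Running the pipeline without the final `| redis-cli` prints the commands first, which is a cheap way to review them before anything is deleted.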
Tuesday, February 11, 2014
Edit Bash Prompt
Edit .bash_profile and add a line of the form export PS1=" ".
Between the quotation marks, you can add the following escape sequences to customize your Terminal prompt:
- \d – Current date
- \t – Current time
- \h – Host name
- \# – Command number
- \u – User name
- \W – Current working directory (e.g., Desktop)
- \w – Current working directory, full path (e.g., /Users/Admin/Desktop)
So, if you want your Terminal prompt to display the user name, followed by the host name, followed by the directory, the .bash_profile entry would be:
export PS1="\u@\h\w$ "
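As a further sketch, the sequences listed above can be combined freely. The variable name `PROMPT_SKETCH` is only illustrative. Single quotes keep the backslashes intact, and `\$` (a standard bash prompt escape not listed above) shows `#` for root and `$` for other users.

```shell
#!/bin/bash
# Illustrative prompt: [time] user@host:directory$
# Single quotes prevent the shell from touching the backslash sequences;
# bash interprets them each time it draws the prompt.
PROMPT_SKETCH='[\t] \u@\h:\W\$ '
export PS1="$PROMPT_SKETCH"
```

After editing .bash_profile, `source ~/.bash_profile` applies the change to the current session.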
Wednesday, February 5, 2014
Generic HBase Export
code/analytics/etl/bing/bing-assembly/src/assembly/oozie/generic_hbase_export.xml
com.klout.bing.hbaseexport.HbaseTableExportMapper
Wednesday, January 29, 2014
Convert word doc to md file
textutil -convert html ~/Downloads/UniquereachImpressionBreakdown.docx -stdout | pandoc -f html -t markdown -o output.md
Monday, January 27, 2014
Bash Format and Send Email
printTable.sh
#!/bin/bash
#
# Script that outputs a Hive table to a file
if [ $# -ne 3 ]
then
echo "Script that outputs a Hive table to a file."
echo "Usage: $0 <db_name> <table_name> <output_path>"
exit 1
fi
DB_NAME=$1
TABLE_NAME=$2
OUTPUT_PATH=$3
# Drop the first two lines and the last line of the hv output, keeping only the rows
echo "use $DB_NAME; select * from $TABLE_NAME;" | /home/insights/insights/hive/hv | tail -n+3 | head -n-1 > "$OUTPUT_PATH"
Caller snippet that exports the table, adds a header row, zips the file, and emails it to each address in $email:
if [ "$email" != "email" ]
then
for line in $(echo $email | tr ',' '\n')
do
ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no abc@host1 "(cd insights/bin; sh printTable.sh bi_insights tmp_kfb_user_meta ~/gfan/kfb_user_meta_$date.tsv && echo -e \"brand_ks_uid\thandle\tbrand\temail\ttw_followers\tfp_likes\tnum_activities_90days\tcity\tstate\tcountry\tkdc_created_at\" | cat - ~/gfan/kfb_user_meta_$date.tsv > ~/gfan/tmp && mv ~/gfan/tmp ~/gfan/kfb_user_meta_$date.tsv && zip ~/gfan/kfb_user_meta_$date.zip ~/gfan/kfb_user_meta_$date.tsv && echo 'Data Format: brand_ks_uid | handle | brand | email | tw_followers | fp_likes | num_activities_90days | city | state | country | kdc_created_at' | mutt -s 'KFB User Meta' $line -a ~/gfan/kfb_user_meta_$date.zip)"
done
fi
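The header step in the one-liner (echo the header, cat it in front of the file, then move the temporary file back) can be sketched as a standalone function. `prepend_header` is a hypothetical name, and the sketch uses `mktemp` rather than the fixed ~/gfan/tmp path so that concurrent runs would not collide.

```shell
#!/bin/bash
# Hypothetical helper: put a header line in front of a file, in place.
# Backslash escapes such as \t in the header are expanded by printf %b.
prepend_header() {
  local header="$1" file="$2" tmp
  tmp="$(mktemp)" || return 1
  { printf '%b\n' "$header"; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
}

# Usage: prepend_header 'brand_ks_uid\thandle\tbrand' kfb_user_meta.tsv
```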
Friday, January 10, 2014
JUnit Test Note
Run a test from the command line:
java -cp /Users/guangle/.m2/repository//junit/junit/4.8.1/junit-4.8.1.jar:target/thunder-libs-0.0.71.jar org.junit.runner.JUnitCore com.klout.thunder.common.UrlSummaryNormalizerTest