Print all processes:
ps -ef
Bash Tool Crontab
1. To edit the current user's crontab:
crontab -e
2. To print the current user's crontab:
crontab -l
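For reference, a crontab entry has five schedule fields (minute, hour, day of month, month, day of week) followed by the command. Here is a minimal sketch of one. The script path and log file are hypothetical placeholders, not from the original note, and the install step is left commented out so nothing is changed:

```shell
# Hypothetical entry: run a script every day at 02:30 and append its output to a log.
# Fields: minute hour day-of-month month day-of-week command
entry='30 2 * * * /path/to/backup.sh >> /tmp/backup.log 2>&1'

# Split off the five schedule fields to show where the command starts.
schedule=$(printf '%s\n' "$entry" | cut -d ' ' -f 1-5)
command_part=$(printf '%s\n' "$entry" | cut -d ' ' -f 6-)

printf 'schedule: %s\n' "$schedule"
printf 'command:  %s\n' "$command_part"

# To install without opening an editor (works on common cron implementations):
# ( crontab -l 2>/dev/null; printf '%s\n' "$entry" ) | crontab -
```

The commented-out line keeps the existing entries and appends the new one, which is usually safer than overwriting the whole crontab.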
Thursday, March 28, 2013
Wednesday, March 27, 2013
Bash redirection
To redirect stdout to a file, overwriting it:
cmd > file.txt
To redirect stdout to a file, appending to it:
cmd >> file.txt
To redirect both stdout and stderr, overwriting:
cmd &> file.txt
To redirect both stdout and stderr, appending:
cmd >> file.txt 2>&1
Monday, March 25, 2013
Java heap space or GC overhead limit exceeded issue
set hive.map.aggr=true;
set hive.map.aggr.hash.force.flush.memory.threshold=0.75;
set hive.map.aggr.hash.percentmemory=0.3;
set hive.groupby.mapaggr.checkinterval=10000;
set mapred.child.java.opts=-Xmx3072M;
set hive.exec.compress.output=true;
set io.seqfile.compression.type=BLOCK;
All of these are good parameters, except maybe the last two, which configure output compression:
set hive.exec.compress.output=true;
and
set io.seqfile.compression.type=BLOCK;
Wednesday, March 20, 2013
Awk
To prefix the first field of each line with a path:
cat ~/12 | awk '{print "/data/prod/"$1}'
To delete the HBase snapshots whose listing line contains the current month abbreviation:
echo "list_snapshots" | hbase shell | egrep "\([A-Za-z]{3} [A-Za-z]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} \+[0-9]{4} [0-9]{4}\)" | grep "$(date +%b)" | awk -F ' ' '{printf ("delete_snapshot '\''%s'\''\n", $1)}' | hbase shell
To generate snapshot commands from the quoted names in file 1:
cat 1 | grep -v main | grep "\[.*\]" | egrep -o "\"[^,|^\"]*\"" | tr -d '"' | awk -v date="$dateString" '{printf("snapshot '\''%s'\'', '\''%s-snapshot-%s'\''\n", $1, $1, date)}'
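The awk parts of these pipelines can be tried without HBase. This is a minimal sketch, assuming two made-up table names and a made-up date stamp. It shows the path prefix and the '\'' trick for putting literal single quotes inside a single-quoted awk program:

```shell
# Made-up table names standing in for the real input files.
tables=$(printf '%s\n' users orders)

# Prefix the first field of every line with a fixed path.
prefixed=$(printf '%s\n' "$tables" | awk '{print "/data/prod/" $1}')

# '\'' closes the shell quote, adds a literal single quote, and reopens the quote,
# so awk sees: printf("snapshot '%s', '%s-snapshot-%s'\n", ...)
dateString=20130320   # hypothetical date stamp
commands=$(printf '%s\n' "$tables" |
  awk -v date="$dateString" '{printf("snapshot '\''%s'\'', '\''%s-snapshot-%s'\''\n", $1, $1, date)}')

printf '%s\n' "$prefixed" "$commands"
```

Printing the generated commands first, and piping them into hbase shell only after checking them, is a cheap safeguard for the delete_snapshot variant.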
Oozie job weird error: No input path Specified
Today I debugged a strange Oozie error with a couple of colleagues. The MapReduce job failed with the error
"No input path Specified"
even though the input directory was set in both the configuration and the workflow.
It turned out that we had missed two configuration properties that tell Oozie to use the new MapReduce API:
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
Tuesday, March 19, 2013
Bash Quick Note
1. Each line in a file (a for loop over $(cat 2) would split on every space, not only on newlines):
while IFS= read -r line; do echo "$line"; done < 2
2. Sed
Prefer '|' as the delimiter when the pattern contains slashes:
sed 's|my/home/directory||g' < in > out
In-place replacement:
sed -i 's|analytics/etl/maxwell/src/assembly/hive/maxwell/||g' in
3. Sed: replace \n with ,\n
sed ':a;N;$!ba;s/\n/,\n/g'
4. Sort by column
sort -t "," -k 2,2 -n input.csv
This sorts numerically by column 2 only.
5. Bash loop over a range of numbers
for i in $(seq 0 855)
do
date=$(date --date "$i day ago" "+%Y%m%d")
echo "alter table oauth_user_services add if not exists partition (dt = '$date') location '$date';" >> partitions
done
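The notes above can be checked on scratch data. This is a small sketch, assuming made-up file contents and GNU versions of sed and date. It contrasts word splitting with line-by-line reading, then runs the sed join, the column sort, and a three-day version of the date loop:

```shell
# Scratch files with made-up contents, so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' 'first line' 'second' > "$workdir/lines.txt"
printf '%s\n' 'b,10' 'a,9' 'c,100' > "$workdir/input.csv"

# 1. for + $(cat) splits on whitespace, so it counts words, not lines.
word_count=0
for word in $(cat "$workdir/lines.txt"); do word_count=$((word_count + 1)); done

# while + read keeps each line intact.
line_count=0
while IFS= read -r line; do line_count=$((line_count + 1)); done < "$workdir/lines.txt"

# 3. Append a comma to every line except the last (GNU sed syntax).
joined=$(sed ':a;N;$!ba;s/\n/,\n/g' "$workdir/lines.txt")

# 4. Numeric sort on column 2; -k 2,2 ends the key at that column.
sorted=$(sort -t "," -k 2,2 -n "$workdir/input.csv")

# 5. The three most recent dates, newest first (GNU date syntax).
dates=$(for i in $(seq 0 2); do date --date "$i day ago" "+%Y%m%d"; done)

echo "words=$word_count lines=$line_count"
printf '%s\n' "$joined" "$sorted" "$dates"
rm -rf "$workdir"
```

On systems with BSD tools (macOS, for example) the sed label syntax and date --date differ, so treat this as a GNU/Linux sketch.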
Friday, March 15, 2013
HBase Lock and Override
The HBase lock acts like a gatekeeper.
Before bulk loading, set the HBase lock first, then set the HBase override.
After bulk loading, release the override first, then unlock HBase.
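The ordering can be sketched as a wrapper function. All helper names below are hypothetical stand-ins that only record which step ran, since the note does not say which commands set the lock or the override. Releasing on failure is also an assumption of this sketch:

```shell
# Hypothetical stand-ins: each helper only records its name. Replace the
# bodies with the real lock, override, and bulk load commands for your cluster.
steps=""
record()           { steps="$steps $1"; }
set_lock()         { record lock; }
set_override()     { record override; }
run_bulkload()     { record bulkload; }
release_override() { record release-override; }
release_lock()     { record unlock; }

bulkload_with_lock() {
  set_lock || return 1
  if ! set_override; then
    release_lock          # back out the lock if the override cannot be set
    return 1
  fi
  run_bulkload
  status=$?
  # Release in reverse order of acquisition, even when the load failed
  # (an assumption of this sketch, not something the note states).
  release_override
  release_lock
  return "$status"
}

bulkload_with_lock
echo "order:$steps"
```

Keeping acquire and release inside one function makes it harder to leave the lock set after an interrupted run.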
Wednesday, March 13, 2013
Remove leading and trailing spaces and tabs of each line
sed 's/^[ \t]*//;s/[ \t]*$//' input.txt > output.txt
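A quick way to check the expression is to feed it known input. This sketch wraps it in a hypothetical function, trim_lines, and uses the POSIX class [[:blank:]] (space and tab) instead of [ \t], which only some sed versions read as a tab:

```shell
# Hypothetical helper: strip leading and trailing spaces and tabs from each line of stdin.
trim_lines() {
  sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
}

# Two sample lines with mixed leading and trailing whitespace.
trimmed=$(printf '  hello world \t\n\tsecond line\n' | trim_lines)
printf '[%s]\n' "$trimmed"
```

Whitespace inside a line ("hello world") is left alone, because only the two anchored patterns are replaced.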
Thursday, March 7, 2013
SSH Key
ssh-keygen
Press Enter at the passphrase prompts to create the key with no passphrase.
Keep the .ssh directory at permission 700.
Keep the .ssh/id_rsa file at permission 600.
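The same steps can be scripted. This is a minimal sketch that writes into a scratch directory rather than the real ~/.ssh; the keydir location is an assumption made for safety. It skips key generation when ssh-keygen is not installed:

```shell
# Hypothetical scratch location; a real setup would use "$HOME/.ssh".
keydir="$(mktemp -d)/.ssh"
mkdir -p "$keydir"
chmod 700 "$keydir"                 # only the owner may enter the directory

if command -v ssh-keygen >/dev/null 2>&1; then
  # -N "" sets an empty passphrase, -f names the key file, -q suppresses output.
  ssh-keygen -q -t rsa -N "" -f "$keydir/id_rsa"
  chmod 600 "$keydir/id_rsa"        # private key readable by the owner only
fi

ls -ld "$keydir"
```

ssh refuses to use a private key that other users can read, which is why the 600 mode matters.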