Monday, April 22, 2013
Read content of HFile via CLI
hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f hdfs://jobs-aa-hnn:8020/data/prod/jobs/hfiles/primaryNetworkProfile/20130415/output/c/d3cc3d77adb8451187be4123a0964062
Wednesday, April 10, 2013
Bash Command
1. cut, rev, uniq, sort
cat 111 | egrep -o "Deleted: /.*/[0-9]{8}" | rev | cut -d "/" -f2- | rev | uniq -c | sort -nr
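A minimal sketch of the same pipeline on made-up input, so it can be run without the original file `111`. The sample lines and the function names `sample_log` and `count_parents` are hypothetical. One deviation from the command above: a `sort` is added before `uniq -c`, because `uniq` only collapses adjacent duplicate lines.

```shell
# Hypothetical sample of the "Deleted: <path>/<yyyymmdd>" lines the pipeline expects.
sample_log() {
cat <<'EOF'
Deleted: /data/prod/a/20130401
Deleted: /data/prod/a/20130402
Deleted: /data/prod/b/20130401
Deleted: /data/prod/a/20130403
EOF
}

# rev | cut -f2- | rev drops the LAST "/"-separated field (the date),
# which cut cannot address directly when the path depth varies.
count_parents() {
  grep -Eo 'Deleted: /.*/[0-9]{8}' \
    | rev | cut -d '/' -f2- | rev \
    | sort | uniq -c | sort -nr
}

sample_log | count_parents
```

On this sample the parent directory /data/prod/a is counted 3 times and /data/prod/b once; the padding in front of the counts depends on the `uniq` implementation.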
2. egrep all numbers and sum them up
cat 111 | egrep -o "\[[0-9]+\] bytes" | egrep -o "[0-9]+" | awk '{sum+=$1} END {print sum}'
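The same idea as a small function fed from stdin, as a sketch only: the name `sum_bytes` and the sample lines are made up. Printing `sum + 0` instead of `sum` is my own tweak so that input without any match prints 0 rather than an empty line.

```shell
# First grep keeps only the "[N] bytes" fragments, second grep keeps only N,
# awk adds the numbers up.
sum_bytes() {
  grep -Eo '\[[0-9]+\] bytes' | grep -Eo '[0-9]+' | awk '{sum += $1} END {print sum + 0}'
}

# Hypothetical log lines; the last one has no "[N] bytes" fragment and is ignored.
printf '%s\n' 'wrote [100] bytes' 'wrote [250] bytes' 'no match here' | sum_bytes
```

For the three sample lines this prints 350.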
3. sh bash.sh parameters
"$@" expands to all the parameters you passed to the script
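A quick way to see why the quotes matter, sketched with two made-up helper functions (`count_args` and `demo` are not from any script mentioned here). Inside a function `"$@"` behaves the same way as inside a script.

```shell
# Prints how many arguments it received.
count_args() { echo "$#"; }

demo() {
  count_args "$@"   # each original parameter stays one word
  count_args "$*"   # all parameters joined into a single word
  count_args $@     # unquoted: parameters are re-split on whitespace
}

demo "a b" c
```

With the two parameters "a b" and c this prints 2, 1 and 3.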
4. strace: log all system calls made by a specific bash command
strace -fvo /home/insights/insights/hive/tt3 -e\!futex -s 8192 bash ./hv
grep " open(" tt3 | grep -v ENOENT | grep -v WR | awk -F\" '{print $2}' | sort -u | sed 's/home\/insights/xxx/g' | sed 's/xxx\/insights/yyy/g' | sed 's/yyy-etl-0.3.9-bin/yyy-etl-1.60-cdh4-bin/g' > files.7
5. For BSD or GNU grep you can use -B num to set how many lines to show before the match and -A num for the number of lines after the match.
grep -B 3 -A 2 foo README.txt
If you want the same number of lines before and after the match, you can use -C num.
grep -C 3 foo README.txt
6. Copy or paste to clipboard (for Mac OS)
pbcopy
pbpaste
7. Print number of fields of each line delimited by '\t'
cat kfb_topic_task1_v4 | awk -F'\t' '{print NF}'
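A small sketch of the field count on inline data instead of a real file. The function name `count_fields` and the sample rows are made up. awk sets NF to the number of fields on the current line after splitting on the separator given with -F.

```shell
# Print the number of tab-separated fields for every input line.
count_fields() { awk -F'\t' '{print NF}'; }

# Three sample lines: 3 fields, 2 fields, and an empty line.
printf 'a\tb\tc\nx\ty\n\n' | count_fields
```

This prints 3, 2 and 0, one number per line. Piping the result through `sort | uniq -c` is a handy way to spot rows with an unexpected number of columns.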
8. Redirection doesn't work with sudo
e.g. this won't work if you don't have permission to write the file, since sudo doesn't apply to the redirection (the file is opened by your own shell before sudo even starts):
sudo echo 1 > /proc/sys/vm/overcommit_memory
To solve this:
sudo sh -c 'echo 1 > /proc/sys/vm/overcommit_memory'
You can also do this easily with tee:
echo 1 | sudo tee /proc/sys/vm/overcommit_memory
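The two working variants differ only in which process opens the target file. The sketch below shows that without needing root: it writes to a temp file (a hypothetical stand-in for /proc/sys/vm/overcommit_memory) and leaves sudo out, so it only illustrates the mechanics, not the permission check itself.

```shell
# Hypothetical stand-in for the real, root-owned target file.
target=$(mktemp)

# Variant 1: the redirection is part of the string run by the child shell,
# so the child (which would be running under sudo) opens the file.
sh -c 'echo 1 > "$1"' sh "$target"
first=$(cat "$target")

# Variant 2: tee opens the file itself, so only tee would need sudo.
# Its copy of the input on stdout is discarded.
echo 2 | tee "$target" > /dev/null
second=$(cat "$target")

echo "$first $second"
rm -f "$target"
```

This prints "1 2". In real use, put sudo in front of `sh -c` or in front of `tee`, as in the commands above.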
Friday, April 5, 2013
Regex
re.match(r"^[a-z]+[*]?$", s)
- The ^ matches the start of the string.
- The [a-z]+ matches one or more lowercase letters.
- The [*]? matches zero or one asterisks.
- The $ matches the end of the string.
Your original regex matches exactly one lowercase character followed by one or more asterisks.
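The pattern can also be tried from the shell, since it is valid extended regex syntax for `grep -E` as well. This is only a sketch: the helper name `matches` and the test strings are made up, and LC_ALL=C is set on the assumption that [a-z] should mean plain ASCII lowercase letters.

```shell
# Succeeds (exit status 0) when the whole string matches the pattern.
matches() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]+[*]?$'
}

for s in abc 'abc*' 'a**' Abc '*'; do
  if matches "$s"; then
    echo "$s: match"
  else
    echo "$s: no match"
  fi
done
```

Only abc and abc* match. a** has two asterisks, Abc contains an uppercase letter, and a lone * has no leading letters.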
Monday, April 1, 2013
HBase Maintenance Tool
Usage: fsck [opts] {only tables}
where [opts] are:
-help Display help options (this)
-details Display full report of all regions.
-timelag {timeInSeconds} Process only regions that have not experienced any metadata updates in the last {timeInSeconds} seconds.
-sleepBeforeRerun {timeInSeconds} Sleep this many seconds before checking if the fix worked if run with -fix
-summary Print only summary of the tables and status.
-metaonly Only check the state of ROOT and META tables.
Metadata Repair options: (expert features, use with caution!)
-fix Try to fix region assignments. This is for backwards compatiblity
-fixAssignments Try to fix region assignments. Replaces the old -fix
-fixMeta Try to fix meta problems. This assumes HDFS region info is good.
-fixHdfsHoles Try to fix region holes in hdfs.
-fixHdfsOrphans Try to fix region dirs with no .regioninfo file in hdfs
-fixHdfsOverlaps Try to fix region overlaps in hdfs.
-fixVersionFile Try to fix missing hbase.version file in hdfs.
-maxMerge <n> When fixing region overlaps, allow at most <n> regions to merge. (n=5 by default)
-sidelineBigOverlaps When fixing region overlaps, allow to sideline big overlaps
-maxOverlapsToSideline <n> When fixing region overlaps, allow at most <n> regions to sideline per group. (n=2 by default)
-fixSplitParents Try to force offline split parents to be online.
-ignorePreCheckPermission ignore filesystem permission pre-check
Datafile Repair options: (expert features, use with caution!)
-checkCorruptHFiles Check all Hfiles by opening them to make sure they are valid
-sidelineCorruptHfiles Quarantine corrupted HFiles. implies -checkCorruptHfiles
Metadata Repair shortcuts
-repair Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps
-repairHoles Shortcut for -fixAssignments -fixMeta -fixHdfsHoles
Heap
par new generation total 176960K, used 28318K [0x0000000412e00000, 0x000000041ee00000, 0x000000041ee00000)
eden space 157312K, 18% used [0x0000000412e00000, 0x00000004149a7b50, 0x000000041c7a0000)
from space 19648K, 0% used [0x000000041c7a0000, 0x000000041c7a0000, 0x000000041dad0000)
to space 19648K, 0% used [0x000000041dad0000, 0x000000041dad0000, 0x000000041ee00000)
concurrent mark-sweep generation total 5312K, used 0K [0x000000041ee00000, 0x000000041f330000, 0x00000007fae00000)
concurrent-mark-sweep perm gen total 21248K, used 10311K [0x00000007fae00000, 0x00000007fc2c0000, 0x0000000800000000)