When using the "hadoop" executable to run HBase programs of any kind, the right way is to set up the classpath first:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$(hbase classpath):/usr/lib/hadoop/lib/*:/usr/lib/hbase/lib/*
This ensures that all HBase dependencies are loaded on the classpath, so your code can find its HBase-specific classes and resources. It also fixes Guava dependency errors such as:
NoClassDefFoundError: com/google/common/collect/Multimap
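For example, here is a minimal sketch of a MapReduce driver that scans an HBase table (the ScanDriver class and my_table table name are made up for illustration). Classes like HBaseConfiguration and TableMapReduceUtil are only found at runtime when the HBase jars are on the classpath as above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ScanDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "scan-my_table");
    job.setJarByClass(ScanDriver.class);
    // initTableMapperJob also ships the HBase dependency jars with the job
    TableMapReduceUtil.initTableMapperJob(
        "my_table", new Scan(), IdentityTableMapper.class,
        ImmutableBytesWritable.class, Result.class, job);
    job.setNumReduceTasks(0);                        // map-only scan
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Build it into a jar and launch it with hadoop jar after exporting HADOOP_CLASSPATH as shown.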
When running the bulk loading tool (completebulkload), do the same and invoke the tool from the HBase jar:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/hbase/*:/usr/lib/hbase/lib/*:/etc/hbase/conf/
HBASE_JAR=$(ls /usr/lib/hbase/hbase-0.*.*.jar | grep -v test | sort -n -r | head -n 1)
RUN_CMD="hadoop jar $HBASE_JAR completebulkload -conf /etc/hbase/conf/hbase-site.xml "
$RUN_CMD /data/hive/thunder/hfile_user_topic_sets/${DATE} user_topic_sets || exit 1
Alternatively, you can call the LoadIncrementalHFiles class directly from Java, as in the sketch below.
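A minimal sketch, assuming the HFile directory is passed as the first argument and using the HTable-based client API that matches the 0.x-era jars above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // args[0] is the directory of HFiles produced upstream,
    // e.g. the hfile_user_topic_sets output from the script above
    HTable table = new HTable(conf, "user_topic_sets");
    LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
    // Moves the HFiles under the given directory into the table's regions
    loader.doBulkLoad(new Path(args[0]), table);
    table.close();
  }
}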
Scan with a filter from the command line (HBase shell):
scan 'url_meta', {FILTER => "RowFilter(=, 'regexstring:.*jobs.*')", LIMIT => 100}
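The same scan from Java, as a sketch using the client API of the same era (printing only the row keys here; read whatever columns your table has):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.RegexStringComparator;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "url_meta");
    Scan scan = new Scan();
    // Keep only rows whose key matches the regex, like the shell command above
    scan.setFilter(new RowFilter(CompareFilter.CompareOp.EQUAL,
        new RegexStringComparator(".*jobs.*")));
    ResultScanner scanner = table.getScanner(scan);
    int count = 0;
    for (Result r : scanner) {
      System.out.println(Bytes.toString(r.getRow()));
      if (++count >= 100) break;   // mirrors LIMIT => 100
    }
    scanner.close();
    table.close();
  }
}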