-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy pathREADME
More file actions
25 lines (15 loc) · 1.06 KB
/
README
File metadata and controls
25 lines (15 loc) · 1.06 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
INFO:
This example shows usage of MultipleOutputs feature of Hadoop Map Reduce and custom code to achieve same funcionality, output to multiple files for M/R jobs.
You can read my blog post for more informations: http://vanjakom.wordpress.com/2011/08/09/hadoop-multipleoutputs-feature/
INSTALL:
1. Your hadoop configuration should be linked to /etc/hadoop/conf ( Cloudera distribution is already doing that ) - this can be modified by editing bin/run.sh
2. I'm using CDH3u0 Hadoop distibution, you probably can change this by editing pom
3. do mvn clean package
4. upload data/sample.txt to hdfs ( I will assume hdfs://localhost/user/hadoop/test/sample.txt )
RUN:
For usage of native MultipleOutputs feature do:
bin/run.sh native hdfs://localhost/user/hadoop/test/sample.txt hdfs://localhost/user/hadoop/test/
out_native/
For usage of custom code do:
bin/run.sh custom hdfs://localhost/user/hadoop/test/sample.txt hdfs://localhost/user/hadoop/test/out_custom/
Feel free to contact me if you have any questions via vanjakom@gmail.com, @vanjakom or by leaving blog comment.