vrachnis/OpSysII

This is a MapReduce program for Hadoop that calculates the TF-IDF values
for every word in a set of input text files.

It was developed as part of a school project.

After running `make jar` to build TfIdf.jar, create the inverted index with:
hadoop jar TfIdf.jar gr.upatras.ceid.romo.Index <input> <output> <title>
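
For readers unfamiliar with the term, an inverted index maps each word to the files that contain it. The following is a minimal single-machine sketch of that idea, not the code in this repo: the class name `InvertedIndexSketch`, the `build` method, and the lowercase, split-on-non-word-characters tokenization are all assumptions made for illustration, and the output format of the actual `Index` job may differ.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not the Index class from this repo.
public class InvertedIndexSketch {

    // Maps each word to the sorted set of file names it appears in.
    // Assumed tokenization: lowercase, split on runs of non-word characters.
    public static Map<String, Set<String>> build(Map<String, String> files) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String token : file.getValue().toLowerCase().split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split() can yield a leading empty token
                }
                index.computeIfAbsent(token, k -> new TreeSet<>())
                     .add(file.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> files = Map.of(
            "a.txt", "The cat sat",
            "b.txt", "The dog barked");
        // Prints each word followed by the files that contain it.
        build(files).forEach((word, names) ->
            System.out.println(word + "\t" + names));
    }
}
```

In a MapReduce setting the same grouping would typically come from the shuffle: a mapper emits (word, file name) pairs and a reducer collects the file names per word.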

To create the TF-IDF metrics, run:
hadoop jar TfIdf.jar gr.upatras.ceid.romo.Tf <input> <output> <title>
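
As a reminder of what the metric means, here is a small plain-Java sketch of one common TF-IDF definition. It is an assumption that the `Tf` job uses this exact variant: here TF is the term count divided by the document length, and IDF is the natural log of the number of documents divided by the number of documents containing the term. The class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical illustration only; the Tf job in this repo may use another variant.
public class TfIdfSketch {

    // Term frequency: occurrences of the term divided by the document length.
    public static double tf(List<String> doc, String term) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long count = doc.stream().filter(term::equals).count();
        return (double) count / doc.size();
    }

    // Inverse document frequency: ln(N / df), or 0 if no document has the term.
    public static double idf(List<List<String>> docs, String term) {
        long df = docs.stream().filter(d -> d.contains(term)).count();
        if (df == 0) {
            return 0.0;
        }
        return Math.log((double) docs.size() / df);
    }

    // TF-IDF of a term in the document at docIndex.
    public static double tfIdf(List<List<String>> docs, int docIndex, String term) {
        return tf(docs.get(docIndex), term) * idf(docs, term);
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("the", "cat", "sat"),
            List.of("the", "dog", "barked"));
        System.out.println("cat: " + tfIdf(docs, 0, "cat"));
        // "the" occurs in every document, so its IDF (and TF-IDF) is 0.0.
        System.out.println("the: " + tfIdf(docs, 0, "the"));
    }
}
```

A word that appears in every input file scores zero under this definition, which is why common words drop out of the ranking.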

I hardcoded the number of reducers to 5 to suit my system.
You might want to change that value in the source to suit your cluster.
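
One way to avoid editing the source each time would be to read the reducer count from an optional extra command-line argument. The sketch below is only a suggestion and is not part of this repo: the class `ReducerCountSketch`, the method `reducerCount`, and the idea of an extra positional argument are all hypothetical. In a driver, the returned value would be passed to Hadoop's `Job.setNumReduceTasks(int)`.

```java
// Hypothetical helper; the drivers in this repo hardcode the value instead.
public class ReducerCountSketch {

    // Same default as the value hardcoded in the project.
    public static final int DEFAULT_REDUCERS = 5;

    // Returns args[position] parsed as a positive integer, or the default
    // when the argument is missing, not a number, or not positive.
    public static int reducerCount(String[] args, int position) {
        if (position < 0 || position >= args.length) {
            return DEFAULT_REDUCERS;
        }
        try {
            int n = Integer.parseInt(args[position]);
            return n > 0 ? n : DEFAULT_REDUCERS;
        } catch (NumberFormatException e) {
            return DEFAULT_REDUCERS;
        }
    }

    public static void main(String[] args) {
        // Assumed layout: <input> <output> <title> [reducers]
        int reducers = reducerCount(args, 3);
        // In a real driver: job.setNumReduceTasks(reducers);
        System.out.println("reducers = " + reducers);
    }
}
```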

About

Repo for the Operating Systems II project
