Collection of PIG scripts for analysis and rewriting of graphs.
###Dependencies
- PIG[3]
- Hadoop[4]
- Python[5]
- Datafu[1]: run
mvn clean packageto compile, and add it to the PIG classpath - D2S4PIG[2]: run
mvn clean packageto compile, and add it to the PIG classpath
###Howto Our sampling method consists of several steps:
- Rewrite the RDF graph to an unlabelled graph (folder
rewrite). We provide 4 different rewrite methods - Analyze the unlabelled graph using standard network analysis algorithms (folder
analysis) - Rewrite these results back to an RDF graph, where the resource weights are aggregated to triple level (folder
roundtrip)
The pig scripts in each of these folders are wrapped as python scripts. Running a script without arguments will print the documentation for that specific script, and the input/output it provides
####links