Data Base Processing

This repository contains the files used to solve two data processing problems with Spark, one in batch mode and one in streaming mode. The code is written in Scala.

The files in this repository are:

in/
interstelar/
interstelar.pdf
README.md

1. in/

The in/ folder contains all the data that we use.

The data files are:

historico_batch.csv
naves_transporte.csv
trayectos

With these data sets we simulate both processes: batch and streaming.

2. interstelar/

This folder contains the following Scala source files:

KafkaConsumoMedio.scala
KafkaDifConsumo.scala
ListaTresMejores.scala
MediasConsumosBatch.scala

Each Scala file contains the code that solves one part of the problems posed in interstelar.pdf.

3. interstelar.pdf

This file states the questions that the Scala code answers:

Batch processing (we solve these questions with Spark SQL):

Ingestion of the data stored over the years by the spacecraft navigation systems and docking ports (batch mode): answered in MediasConsumosBatch.scala
Data cleaning: answered in MediasConsumosBatch.scala
Calculation of the mean consumption of every spacecraft in the fleet, grouped by spacecraft (each spacecraft has an identifier): answered in MediasConsumosBatch.scala
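
The cleaning and per-spacecraft mean steps above can be sketched with plain Scala collections. This is only an illustration of the aggregation, not the Spark SQL code in MediasConsumosBatch.scala. The record shape (`shipId`, `consumption`), the two-column line format, and the rule that negative values are invalid are all assumptions, since this README does not document the CSV columns.

```scala
import scala.util.Try

// Hypothetical record shape; the real columns of historico_batch.csv may differ.
final case class Reading(shipId: String, consumption: Double)

object BatchMeans {
  // Data cleaning: parse one "shipId,consumption" line.
  // Rows with a missing id, a non-numeric value, NaN, or a negative value
  // (assumed invalid) are dropped by returning None.
  def parse(line: String): Option[Reading] =
    line.split(",").map(_.trim) match {
      case Array(id, value) if id.nonEmpty =>
        Try(value.toDouble).toOption
          .filter(_ >= 0.0)
          .map(v => Reading(id, v))
      case _ => None
    }

  // Mean consumption grouped by spacecraft identifier.
  def meanByShip(readings: Seq[Reading]): Map[String, Double] =
    readings.groupBy(_.shipId).map { case (id, rs) =>
      id -> rs.map(_.consumption).sum / rs.size
    }
}
```

For example, `BatchMeans.meanByShip(lines.flatMap(BatchMeans.parse))` turns a sequence of raw lines into a map from spacecraft identifier to mean consumption, skipping the rows that fail cleaning.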

Streaming processing (we solve these questions with Spark Streaming and a Kafka machine):

Real-time consumption data (Spark Streaming): answered in KafkaConsumoMedio.scala
Calculation, in real time, of the mean consumption of every spacecraft in the fleet, grouped by spacecraft (each spacecraft has an identifier): answered in KafkaConsumoMedio.scala
Processing of both data sets to obtain the difference between the average consumptions: answered in KafkaDifConsumo.scala
Construction of a collection (List) of tuples (spacecraft identifier and model) with the three best transports: answered in ListaTresMejores.scala
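
The last two questions can be sketched with plain Scala collections as well. Again, this is an illustration under stated assumptions, not the code in KafkaDifConsumo.scala or ListaTresMejores.scala. It assumes that the batch and streaming means are already available as maps keyed by spacecraft identifier, that the difference is streaming mean minus batch mean, that a hypothetical `models` map links each identifier to its model, and that "best" means lowest mean consumption. None of these choices are stated in this README.

```scala
object FleetComparison {
  // Difference between average consumptions (assumed: streaming minus batch),
  // computed only for spacecraft that appear in both data sets.
  def meanDifference(batch: Map[String, Double],
                     streaming: Map[String, Double]): Map[String, Double] =
    batch.keySet.intersect(streaming.keySet)
      .map(id => id -> (streaming(id) - batch(id)))
      .toMap

  // List of (identifier, model) tuples for the three best transports.
  // Assumption: "best" is the lowest mean consumption; ties break by identifier.
  // Spacecraft without a known model are ignored.
  def topThree(means: Map[String, Double],
               models: Map[String, String]): List[(String, String)] =
    means.toList
      .collect { case (id, mean) if models.contains(id) => (id, models(id), mean) }
      .sortWith((a, b) => a._3 < b._3 || (a._3 == b._3 && a._1 < b._1))
      .take(3)
      .map { case (id, model, _) => (id, model) }
}
```

Filtering out spacecraft without a model before sorting keeps the result at three entries whenever at least three spacecraft have a known model.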
