
Rapture Getting Started Guide

datsclark edited this page Feb 4, 2016 · 3 revisions


 


Introduction to your Rapture Information Manager

This guide introduces the Rapture Information Manager (RIM) application, which runs on a cloud-based Rapture instance. The RIM is one part of Rapture Data, the data-lifecycle manager built on Rapture. You can interact with Rapture through the web interface or through the exposed APIs.

The guide covers the basic interactions with Rapture and walks you through the Capture, Curate, Search and Distribute steps that the RIM supports. To help you get started, sample data has been pre-loaded into your Rapture instance.

Looking ahead, we intend to extend the capabilities of the RIM to support additional functionality within each of these steps.

Who is this for? 

Retrieving data and making it available for subsequent access is a common data processing problem across industries. The Data Lifecycle Manager built on Rapture makes this easy to achieve in a consistent, robust and visible way. This product will be of interest to users across various domains who are tasked with managing complex data in a controlled and audited manner. 

What are the goals of this Intro? 

  • Develop an understanding of certain key concepts in Rapture, the underlying platform which powers the RIM application
  • Load and transform datasets in your RIM application; learn how to perform these operations in a few different ways: 
    • Server-side Reflex, a powerful scripting language
    • Client-side Java API calls
    • Client-side Python API calls

Additional Information

For further reading on the underlying Rapture platform, please check out additional resources on our Public Wiki.

For sales or other general inquiries, please contact sales@incapturetechnologies.com.


What is in your Rapture instance?

  • Rapture Information Manager
    • Web-based user interface 

    • Basic CRUD operations on Rapture managed data (Documents, Series, Blobs)

    • Reflex REPL (Rapture's server and client side scripting language)

  • Core Rapture API

    • Cloud-based instances of the Rapture API server
    • File based repositories: Document, Series, Blob, Sheet   

Let's get started

  1. Sign up (link to Etienne)

  2. Receive email with URL, User, Pass

  3. Login

 


Getting to know Rapture

Using Rapture Web to see Apps

Browse to Rapture using the link in the email you received.

Log in using the credentials provided in that email.

 

Default Login
Username: { as you selected }
Password: { temporary password from email }

 

 


Using Rapture Web to browse Data

When you open the RIM, you will be presented with a list of all repositories, which hold all of the data in Rapture.

As you browse this data, keep in mind a few key concepts. 

Repository: A place to store information. A repository has a name (unique in a given Rapture instance) and an underlying implementation. The idea is that application developers interact with a repository using a consistent API and Rapture takes care of the details of how to manage the information in the underlying implementation. The implementation in this case refers to a database system and the systems supported currently cover a wide range of technologies from traditional relational databases to the newer “distributed key-value stores”. There are a few major types of repositories: 


Document Repository: Used to store structured data indexed by a unique text key; usually formatted as a JSON structure. In this trial product, the document repository is implemented on MongoDB running on Amazon EC2. 
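To make the idea of "structured data indexed by a unique text key" concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the concept: the class `InMemoryDocRepo`, its `put_doc` and `get_doc` methods, and the key `orders/1001` are hypothetical names invented for this example and are not part of the Rapture API.

```python
import json


class InMemoryDocRepo(object):
    """Toy stand-in for a document repository (hypothetical, not the Rapture API)."""

    def __init__(self, name):
        self.name = name   # repository name, unique within an instance
        self._docs = {}    # unique text key -> JSON string

    def put_doc(self, key, content):
        # Store the content as JSON text under its key, replacing any earlier version.
        self._docs[key] = json.dumps(content, sort_keys=True)

    def get_doc(self, key):
        # Return the parsed document, or None when the key is unknown.
        raw = self._docs.get(key)
        return None if raw is None else json.loads(raw)


repo = InMemoryDocRepo("tutorialDoc")
repo.put_doc("orders/1001", {"customer": "ACME", "total": 42.5})
print(repo.get_doc("orders/1001"))
```

In a real repository the dictionary would be replaced by the underlying database, while the calling code stays the same.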

Blob Repository: Binary data without any other classifications. It is useful for storing items such as images or PDF data.

 TBD

Series Repository: A keyed list of data points. In most applications, the key is a timestamp, and therefore the series represents how the data changes with time. The data points within a series can be simple types, such as numbers, or they can be data structures. 
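The following Python sketch models a series as a keyed, ordered list of data points and shows a "points from a starting key" query of the kind a time series invites. It is an assumption-laden illustration, not Rapture code: `InMemorySeries`, `add_point` and `points_from` are hypothetical names, and the yyyyMMdd key format is assumed because such strings sort in date order.

```python
import bisect


class InMemorySeries(object):
    """Toy series (hypothetical, not the Rapture API): points sorted by string key."""

    def __init__(self):
        self._keys = []     # point keys, kept sorted
        self._values = {}   # key -> value (a number or a data structure)

    def add_point(self, key, value):
        # Insert new keys in sorted position; existing keys are overwritten.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def points_from(self, start_key, max_points):
        # yyyyMMdd strings compare lexicographically in chronological order,
        # so a binary search finds the first point on or after start_key.
        start = bisect.bisect_left(self._keys, start_key)
        selected = self._keys[start:start + max_points]
        return [(k, self._values[k]) for k in selected]


series = InMemorySeries()
series.add_point("20141029", {"open": 10.4, "close": 10.9})
series.add_point("20141027", 10.1)
series.add_point("20141028", 10.4)
print(series.points_from("20141028", 10000))
```

Note that a point's value can be a plain number or a structure, matching the description above.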

TBD


Your First Project


 

Before beginning this demonstration of Rapture, make sure your environment satisfies the following prerequisites:

  1. git (https://git-scm.com/downloads)
  2. Java 8 (or later) runtime, installed and accessible
  3. Python 2.7.10 (or later), with the following Python modules installed:
    1. requests (https://pypi.python.org/pypi/requests)
    2. numpy (https://pypi.python.org/pypi/numpy)
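The checklist above can be verified with a short script. The sketch below is one possible way to do it and is not part of the tutorial files; the function name `check_prerequisites` is hypothetical. It assumes Python 3.4 or later (for `shutil.which` and `importlib.util.find_spec`) and only checks that `java` is on the PATH, not that it is version 8 or later.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False (missing)."""
    results = {
        # Tuple comparison against the running interpreter's version.
        "python >= 2.7.10": sys.version_info >= (2, 7, 10),
        # shutil.which returns None when the executable is not on the PATH.
        "git on PATH": shutil.which("git") is not None,
        "java on PATH": shutil.which("java") is not None,
    }
    for module in ("requests", "numpy"):
        # find_spec locates a top-level module without importing it.
        found = importlib.util.find_spec(module) is not None
        results["module " + module] = found
    return results


if __name__ == "__main__":
    for name, ok in sorted(check_prerequisites().items()):
        print("%-20s %s" % (name, "ok" if ok else "MISSING"))
```

Run `java -version` separately to confirm that the Java runtime is version 8 or later.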

Download Files


TBD:

git pull incapture/rapturedemo or something.

 

 


Create new Repo

 

  1. Blob / Mongo
  2. Doc  / Redis
  3. Series / Cassandra

     

For the Tutorial:

Blob: Mongo: turtorialBlob

Doc: Mongo: tutorialDoc


Language guides

 

  1. Run upload script to load CSV into Blob
  2. Run translate script to create Doc
  3. Run analyze script to create/update Series
  4. Run report script to create and view the report

 

  • Reflex/UI Instructions
  • Python Instructions
  • Java Instructions
 
   

Optional: Reset Data

Steps to delete the added data so tutorials can be run from the beginning:

  1. Copy the following Reflex script into the Reflex REPL screen. To open the REPL, click the ">__" icon in the upper left of the RIM web application (the third icon from the left).

    // start date for the series column which we will be deleting from
    startDateInclusive = "20141028";

    // get a map from uri to RaptureFolderInfo object
    uris = #series.listSeriesByUriPrefix("series://datacapture/HIST/Provider_1a", 3);
    // convert it so we can access its fields
    uriMap = fromjson(json(uris));
    for k in keys(uriMap) do
        v = uriMap[k];
        // if its a folder, we skip it
        if v['folder'] == false do
            println("Cleaning series: " + k);
            // get the points after a starting point
            pts = #series.getPointsAfter(k, startDateInclusive, 10000);
            columnsToRemove = [];
            for pt in pts do
                ptMap = fromjson(json(pt));
                columnsToRemove += ptMap['column'];
            end
            // finally delete the series points using our list of column keys
            #series.deletePointsFromSeriesByPointKey(k, columnsToRemove);
        end
    end

    // delete the doc repo
    #doc.deleteDocRepo("document://tutorialDoc");
    // delete the blob repo
    #blob.deleteBlobRepo("blob://turtorialBlob");

    println("Done!");

  2. Press Return. Everything will be reset to its original state, as it was before the tutorials were run.
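The reset script addresses data through URIs such as series://datacapture/HIST/Provider_1a and document://tutorialDoc. The Python sketch below splits such a URI into its parts. It rests on an assumption drawn only from those examples: that the scheme names the data type and the first segment after "://" names the repository. The names `RaptureUri` and `split_uri` are hypothetical and do not come from any Rapture client library.

```python
from collections import namedtuple

# Hypothetical container for the three parts of a URI like scheme://repo/path.
RaptureUri = namedtuple("RaptureUri", ["scheme", "repo", "path"])


def split_uri(uri):
    """Split 'scheme://repo/path' into scheme, repo and path (path may be empty)."""
    scheme, sep, rest = uri.partition("://")
    if not sep or not scheme or not rest:
        raise ValueError("expected scheme://repo[/path], got %r" % (uri,))
    # Everything up to the first "/" is treated as the repository name (assumption).
    repo, _, path = rest.partition("/")
    return RaptureUri(scheme, repo, path)


for uri in ("series://datacapture/HIST/Provider_1a",
            "document://tutorialDoc",
            "blob://turtorialBlob"):
    print(split_uri(uri))
```

A helper like this can be handy when checking which repository a script is about to modify or delete.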

Reflex is a scripting language developed at Incapture Technologies. It is designed for performing cloud-based data manipulation. Reflex takes very little resource overhead, and therefore handles the data interactions more efficiently than JRuby, Jython, or any other standard scripting tool. We will demonstrate how to perform various operations using Reflex; for those interested in learning more, please check out the .