Note: This repository was archived by the owner on Jan 10, 2026. It is now read-only.

RAIL Score Python SDK


Responsible AI Research Paper

Evaluate and generate responsible AI content with the official Python client for the RAIL Score API.

Documentation • API Reference • Examples • Report Issues


🌟 Features at a Glance

| Feature | Description |
| --- | --- |
| 🎯 8 RAIL Dimensions | Evaluate content across Reliability, Accountability, Interpretability, Legal Compliance, Safety, Privacy, Transparency, and Fairness |
| ⚡ Multiple Evaluation Tiers | Choose from basic, dimension-specific, custom, weighted, detailed, advanced, and batch evaluation |
| 🤖 AI Generation | Generate RAG-grounded responses, reprompt suggestions, and protected content |
| ✅ Compliance Checks | Built-in support for GDPR, HIPAA, CCPA, and EU AI Act compliance |
| 📊 Batch Processing | Evaluate up to 100 items per request efficiently |
| 🔒 Type-Safe | Full typing support with structured dataclasses for a better IDE experience |
| 🔄 Auto-Retry | Built-in error handling and automatic retries |
| 📈 Usage Tracking | Monitor credits, usage history, and API health |

🚀 Quick Start

Installation

pip install rail-score

Basic Usage

from rail_score import RailScore

# Initialize client
client = RailScore(api_key="your-rail-api-key")

# Evaluate content
result = client.evaluation.basic("Our AI system ensures user privacy and data security.")

# Access scores
print(f"Overall RAIL Score: {result.rail_score.score}")
print(f"Confidence: {result.rail_score.confidence}")
print(f"Privacy Score: {result.scores['privacy'].score}")
print(f"Credits Used: {result.metadata.credits_consumed}")


🔧 Configuration

from rail_score import RailScore

client = RailScore(
    api_key="your-rail-api-key",
    base_url="https://api.responsibleailabs.ai",  # Optional
    timeout=60  # Request timeout in seconds
)
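If you prefer not to hardcode the key, a small helper can read it from the environment. The variable name `RAIL_API_KEY` is a convention of this sketch, not something the SDK mandates:

```python
import os

def load_api_key(env_var: str = "RAIL_API_KEY") -> str:
    """Fetch the RAIL API key from the environment, failing fast if unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your RAIL API key")
    return key

# client = RailScore(api_key=load_api_key())
```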

Getting an API Key: Visit responsibleailabs.ai to sign up and get your API key.


📊 Evaluation API

Basic Evaluation

Evaluate content across all 8 RAIL dimensions:

result = client.evaluation.basic(
    content="Your AI-generated content here",
    weights=None  # Optional custom weights
)

# Access results
print(result.rail_score.score)  # Overall score (0-10)
print(result.rail_score.confidence)  # Confidence (0-1)

# Individual dimensions
for dim_name, dim_score in result.scores.items():
    print(f"{dim_name}: {dim_score.score} (confidence: {dim_score.confidence})")
    print(f"  Explanation: {dim_score.explanation}")
    if dim_score.issues:
        print(f"  Issues: {', '.join(dim_score.issues)}")

# Metadata
print(f"Request ID: {result.metadata.req_id}")
print(f"Credits Used: {result.metadata.credits_consumed}")
print(f"Processing Time: {result.metadata.processing_time_ms}ms")

Dimension-Specific Evaluation

Evaluate on one specific dimension only:

result = client.evaluation.dimension(
    content="We collect user data with consent",
    dimension="privacy"  # One of: reliability, accountability, interpretability,
                         # legal_compliance, safety, privacy, transparency, fairness
)

print(result['result']['score'])
print(result['result']['explanation'])

Custom Evaluation

Evaluate only specific dimensions:

result = client.evaluation.custom(
    content="Healthcare AI system",
    dimensions=["safety", "privacy", "reliability"],
    weights={"safety": 40, "privacy": 35, "reliability": 25}
)

print(result.rail_score.score)
print(result.scores.keys())  # Only evaluated dimensions

Weighted Evaluation

Custom dimension weights:

weights = {
    "safety": 30,
    "privacy": 25,
    "reliability": 20,
    "accountability": 15,
    "transparency": 5,
    "fairness": 3,
    "interpretability": 1,
    "legal_compliance": 1
}

result = client.evaluation.weighted("Content here", weights=weights)
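The weights above sum to 100. If your raw priorities do not, a helper like the following can rescale them first (this assumes, based on the examples here, that the API expects weights on a 0-100 scale):

```python
def normalize_weights(weights: dict) -> dict:
    """Scale raw weights so they sum to 100, preserving their ratios."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("Weights must sum to a positive number")
    return {dim: round(w * 100 / total, 2) for dim, w in weights.items()}

# result = client.evaluation.weighted("Content here",
#                                     weights=normalize_weights({"safety": 3, "privacy": 1}))
```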

Detailed Evaluation

Get detailed breakdown with strengths and weaknesses:

result = client.evaluation.detailed("AI model description")

summary = result['result']['summary']
print(f"Strengths: {summary['strengths']}")
print(f"Weaknesses: {summary['weaknesses']}")
print(f"Improvements needed: {summary['improvements_needed']}")

Advanced Evaluation

Ensemble evaluation with higher confidence:

result = client.evaluation.advanced(
    content="Critical AI system",
    context="Healthcare decision support system"  # Optional
)

print(result.rail_score.confidence)  # Typically 0.90+

Batch Evaluation

Evaluate multiple items in one request:

items = [
    {"content": "First AI-generated text"},
    {"content": "Second AI-generated text"},
    {"content": "Third AI-generated text"}
]

result = client.evaluation.batch(
    items=items,
    dimensions=["safety", "privacy", "fairness"],
    tier="balanced"  # "fast", "balanced", or "advanced"
)

print(f"Processed: {result.successful}/{result.total_items}")

for item_result in result.results:
    print(f"Score: {item_result.rail_score.score}")
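Because batch evaluation caps out at 100 items per request, larger workloads need client-side chunking. A minimal sketch using only the `batch()` call shown above:

```python
def chunked(items, size=100):
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# for chunk in chunked(all_items):
#     result = client.evaluation.batch(items=chunk, dimensions=["safety"])
```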

RAG Evaluation

Evaluate RAG responses for hallucinations:

result = client.evaluation.rag_evaluate(
    query="What is the capital of France?",
    response="The capital of France is Paris.",
    context_chunks=[
        {"content": "Paris is the capital city of France."},
        {"content": "France is a country in Western Europe."}
    ]
)

metrics = result['result']['rag_metrics']
print(f"Hallucination Score: {metrics['hallucination_score']}")  # Lower is better
print(f"Grounding Score: {result['result']['grounding_score']}")  # Higher is better
print(f"Overall Quality: {metrics['overall_quality']}")
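The two headline metrics run in opposite directions: hallucination is better low, grounding better high. A simple gate can combine them into a pass/fail decision; the 0.3 and 0.7 thresholds below are illustrative choices, not values prescribed by the API:

```python
def rag_response_ok(metrics: dict, grounding_score: float,
                    max_hallucination: float = 0.3,
                    min_grounding: float = 0.7) -> bool:
    """Accept a RAG answer only if it is well grounded and low on hallucination."""
    return (metrics["hallucination_score"] <= max_hallucination
            and grounding_score >= min_grounding)

# ok = rag_response_ok(result['result']['rag_metrics'],
#                      result['result']['grounding_score'])
```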

🤖 Generation API

RAG Chat

Generate context-grounded responses:

result = client.generation.rag_chat(
    query="What are the benefits of GDPR compliance?",
    context="GDPR provides data protection and privacy rights to EU citizens...",
    max_tokens=300,
    model="gpt-4o-mini"
)

print(result.generated_text)
print(f"Tokens used: {result.usage['total_tokens']}")
print(f"Credits: {result.metadata.credits_consumed}")

Reprompting

Get improvement suggestions:

current_scores = {
    "transparency": {"score": 4.5},
    "accountability": {"score": 5.0}
}

result = client.generation.reprompt(
    content="AI makes decisions automatically",
    current_scores=current_scores,
    target_score=8.0,
    focus_dimensions=["transparency", "accountability"]
)

suggestions = result['result']['improvement_suggestions']
print(suggestions['text_replacements'])
print(suggestions['expected_improvements'])

Protected Generation

Generate content with safety filters:

result = client.generation.protected_generate(
    prompt="Write a description for an AI hiring tool",
    max_tokens=200,
    min_rail_score=8.0
)

print(result.generated_text)
print(f"RAIL Score: {result.rail_score}")
print(f"Safety Passed: {result.safety_passed}")

✅ Compliance API

GDPR Compliance

result = client.compliance.gdpr(
    content="We collect user emails for marketing purposes",
    context={"data_type": "personal", "region": "EU"},
    strict_mode=True  # Use 7.5 threshold instead of 7.0
)

print(f"Compliance Score: {result.compliance_score}")
print(f"Passed: {result.passed}/{result.requirements_checked}")

for req in result.requirements:
    print(f"{req.requirement} ({req.article}): {req.status}")
    if req.status == "FAIL":
        print(f"  Issue: {req.issue}")

Other Compliance Checks

# CCPA
result = client.compliance.ccpa("Content here")

# HIPAA
result = client.compliance.hipaa("Healthcare AI system")

# EU AI Act
result = client.compliance.ai_act("AI system description")

πŸ› οΈ Utilities

Check Credits

credits = client.get_credits()

print(f"Plan: {credits['plan']}")
print(f"Monthly Limit: {credits['credits']['monthly_limit']}")
print(f"Used This Month: {credits['credits']['used_this_month']}")
print(f"Remaining: {credits['credits']['remaining']}")

Get Usage History

usage = client.get_usage(limit=50, from_date="2025-01-01T00:00:00Z")

print(f"Total records: {usage['total_records']}")
print(f"Total credits used: {usage['total_credits_used']}")

for entry in usage['history']:
    print(f"{entry['timestamp']}: {entry['endpoint']} - {entry['credits_used']} credits")
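The `history` entries above make per-endpoint cost reporting a short aggregation. A sketch over the documented fields:

```python
from collections import Counter

def credits_by_endpoint(history: list) -> Counter:
    """Sum credits_used per endpoint across usage-history entries."""
    totals = Counter()
    for entry in history:
        totals[entry["endpoint"]] += entry["credits_used"]
    return totals

# totals = credits_by_endpoint(usage['history'])
```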

Health Check

health = client.health_check()

print(f"Status: {health['ok']}")
print(f"Version: {health['version']}")

⚠️ Error Handling

from rail_score import (
    RailScore,
    AuthenticationError,
    InsufficientCreditsError,
    ValidationError,
    RateLimitError,
    PlanUpgradeRequired
)

client = RailScore(api_key="your-api-key")

try:
    result = client.evaluation.basic("Your content")
except AuthenticationError:
    print("Invalid API key")
except InsufficientCreditsError as e:
    print(f"Not enough credits. Balance: {e.balance}, Required: {e.required}")
except ValidationError as e:
    print(f"Invalid parameters: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after: {e.retry_after} seconds")
except PlanUpgradeRequired:
    print("This endpoint requires a Pro or higher plan")
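Even with the client's automatic retries, sustained load can still surface `RateLimitError`. A generic wrapper that honors the server-suggested delay might look like this; the exception class is passed in as a parameter so the sketch stays independent of the SDK:

```python
import time

def call_with_retry(fn, rate_limit_exc, max_attempts=3, fallback_delay=1.0):
    """Call fn(), sleeping for the server-suggested delay on rate-limit errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except rate_limit_exc as e:
            if attempt == max_attempts:
                raise
            time.sleep(getattr(e, "retry_after", None) or fallback_delay)

# result = call_with_retry(lambda: client.evaluation.basic("Your content"),
#                          RateLimitError)
```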

📦 Response Structure

All endpoints return responses with this structure:

{
  "result": {
    "rail_score": {"score": 8.7, "confidence": 0.90},
    "scores": {
      "privacy": {"score": 9.1, "confidence": 0.94, "explanation": "..."},
      ...
    },
    "processing_time": 2.5
  },
  "metadata": {
    "req_id": "abc-123",
    "tier": "pro",
    "queue_wait_time_ms": 1200.0,
    "processing_time_ms": 2500.0,
    "credits_consumed": 2.0,
    "timestamp": "2025-11-03T10:30:00Z"
  }
}
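Some endpoints (such as dimension-specific evaluation) return this structure as a raw dictionary. A small helper can flatten it into per-dimension scores; the sample payload below mirrors the structure above:

```python
def summarize_response(payload: dict) -> dict:
    """Flatten a raw RAIL response into {dimension: score} plus the overall score."""
    result = payload["result"]
    summary = {dim: entry["score"] for dim, entry in result.get("scores", {}).items()}
    summary["overall"] = result["rail_score"]["score"]
    return summary

sample = {
    "result": {
        "rail_score": {"score": 8.7, "confidence": 0.90},
        "scores": {"privacy": {"score": 9.1, "confidence": 0.94, "explanation": "..."}},
    },
    "metadata": {"req_id": "abc-123"},
}
```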

💡 Use Cases

Content Moderation

from rail_score import RailScore

client = RailScore(api_key="your-key")

# Check user-generated content for safety
result = client.evaluation.dimension(
    content="User comment here",
    dimension="safety"
)

if result['result']['score'] < 7.0:
    print("Content flagged for review")
    print(f"Issues: {result['result']['issues']}")

Batch Content Evaluation

# Evaluate multiple pieces of content
items = [{"content": text} for text in content_list]

result = client.evaluation.batch(
    items=items[:100],  # Max 100 items
    dimensions=["safety", "fairness", "privacy"]
)

# Filter by score
safe_content = [
    items[i]
    for i, res in enumerate(result.results)
    if res.rail_score.score >= 7.5
]

Compliance Checking

# Check GDPR compliance
result = client.compliance.gdpr(
    content="AI system for user profiling",
    context={"purpose": "marketing", "data_type": "personal"}
)

if result.failed > 0:
    print("GDPR compliance issues found:")
    for req in result.requirements:
        if req.status == "FAIL":
            print(f"- {req.requirement}: {req.issue}")

🔨 Development

Requirements

  • Python 3.8+
  • requests >= 2.28.0

Setup

# Clone repository
git clone https://github.com/Responsible-AI-Labs/rail-score.git
cd rail-score

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black rail_score/

# Type checking
mypy rail_score/

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


⭐ Star History

If you find RAIL Score useful, please consider giving it a star! ⭐



Made with ❤️ by Responsible AI Labs

Website • Documentation • GitHub • Twitter