5 changes: 4 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -39,4 +39,7 @@ nosetests.xml
htmlcov
coverage.xml

docs/_build
docs/_build

# Virtual Environment
venv/
1 change: 0 additions & 1 deletion .travis.yml
@@ -1,7 +1,6 @@
language: python
python:
- '2.7'
- '3.2'
- '3.3'
install:
- pip install .
Expand Down
21 changes: 18 additions & 3 deletions README.rst
@@ -1,7 +1,7 @@
.. image:: https://travis-ci.org/SpazioDati/python-dandelion-eu.png?branch=develop
:target: https://travis-ci.org/SpazioDati/python-dandelion-eu

.. image:: https://coveralls.io/repos/SpazioDati/python-dandelion-eu/badge.png?branch=develop
.. image:: https://coveralls.io/repos/SpazioDati/python-dandelion-eu/badge.png?branch=master
:target: https://coveralls.io/r/SpazioDati/python-dandelion-eu?branch=develop

..
Expand All @@ -16,13 +16,28 @@

.. _PyPI: https://pypi.python.org/pypi/dandelion-eu/
.. _ReadTheDocs: http://python-dandelion-eu.readthedocs.org/

.. _dandelion: https://dandelion.eu/accounts/register/?next=/
.. _dandelion.eu: http://dandelion.eu/

python-dandelion-eu
===================

Connect to the dandelion.eu API in a very pythonic way!
Bring the power of the dandelion.eu_ Datagem, DataTXT and Sentiment APIs to your Python applications and scripts!
Semantic analysis in Python couldn't be easier.


.. code-block:: py

    >>> from dandelion import DataTXT
    >>> datatxt = DataTXT(token='YOUR_TOKEN')
    >>> response = datatxt.nex('The doctor says an apple is better than an orange')
    >>> for annotation in response.annotations:
    ...     print(annotation)
    ...

Register on dandelion_ to obtain your authentication token and enrich your application with our semantic intelligence.

NOTE: the client still supports the legacy authentication system through ``app_id`` and ``app_key``.

Installation
------------
Expand Down
2 changes: 2 additions & 0 deletions dandelion/__init__.py
Expand Up @@ -4,6 +4,8 @@
from dandelion.base import DandelionException, DandelionConfig
from dandelion.datagem import Datagem
from dandelion.datatxt import DataTXT
from dandelion.sentiment import Sentiment
from dandelion.base import AttributeDict

__version__ = '0.2.2'
default_config = DandelionConfig()
75 changes: 46 additions & 29 deletions dandelion/base.py
@@ -1,23 +1,26 @@
""" base classes
"""
from __future__ import unicode_literals

import requests

from dandelion.cache.base import NoCache
from dandelion.utils import AttributeDict

try:
import urlparse
except ImportError:
from urllib import parse as urlparse

from dandelion.cache.base import NoCache
from dandelion.utils import AttributeDict


class DandelionConfig(dict):
""" class for storing the default dandelion configuration, such
as authentication parameters
"""
ALLOWED_KEYS = ['app_id', 'app_key']
ALLOWED_KEYS = ['token', 'app_id', 'app_key']

def __setitem__(self, key, value):
if not key in self.ALLOWED_KEYS:
if key not in self.ALLOWED_KEYS:
raise DandelionException('invalid config param: {}'.format(key))
super(DandelionConfig, self).__setitem__(key, value)

Expand All @@ -32,43 +35,57 @@ def __init__(self, dandelion_obj=None, **kwargs):
self.data = dandelion_obj.data
else:
self.message = "{}".format(dandelion_obj)
self.code = kwargs.get('code')
self.data = kwargs.get('data')
super(DandelionException, self).__init__(self.message)


class MissingParameterException(DandelionException):
code = 'error.missingParameter'

def __init__(self, param_name):
self.data = {'parameter': param_name}
super(MissingParameterException, self).__init__(
'Param "{}" is required'.format(param_name)
)


class BaseDandelionRequest(object):
DANDELION_HOST = 'api.dandelion.eu'
REQUIRE_AUTH = True

def __init__(self, **kwargs):
import requests
def __init__(self, host=None, cache=NoCache(), token=None, app_id=None, app_key=None, **kwargs):
from dandelion import default_config
self.uri = self._get_uri(host=kwargs.get('host'))
self.app_id = kwargs.get('app_id', default_config.get('app_id'))
self.app_key = kwargs.get('app_key', default_config.get('app_key'))
self.uri = self._get_uri(host=host)
self.requests = requests.session()
self.cache = kwargs.get('cache', NoCache())
self.cache = cache

if self.REQUIRE_AUTH and not self.app_id:
raise MissingParameterException("app_id")
if self.REQUIRE_AUTH and not self.app_key:
raise MissingParameterException("app_key")
if self.REQUIRE_AUTH:
self.auth = ''

if not self._check_authentication_parameters(token, app_id, app_key, ''):
token = default_config.get('token')
app_id = default_config.get('app_id')
app_key = default_config.get('app_key')

if not self._check_authentication_parameters(token, app_id, app_key, ' (in default config)'):
raise DandelionException('You have to specify the authentication token OR the app_id and app_key!')

def _check_authentication_parameters(self, token, app_id, app_key, mode):
if token:
if not app_id and not app_key:
self.auth = 'token'
self.token = token
return True
else:
raise DandelionException('Too many authentication parameters'+mode+', you have to specify \'token\' OR \'app_id\' and \'app_key\'!')
else:
if app_id and app_key:
self.auth = 'legacy'
self.app_id = app_id
self.app_key = app_key
return True
elif app_id or app_key:
raise DandelionException('To use the legacy authentication system you have to specify both \'app_id\' and \'app_key\''+mode+'!')
return False

def do_request(self, params, extra_url='', method='post', **kwargs):
if self.REQUIRE_AUTH:
params['$app_id'] = self.app_id
params['$app_key'] = self.app_key
if self.auth == 'token':
params['token'] = self.token
elif self.auth == 'legacy':
params['$app_id'] = self.app_id
params['$app_key'] = self.app_key
else:
raise DandelionException('Error in authentication mechanism!')

url = self.uri + ''.join('/' + x for x in extra_url)

Expand Down
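The credential resolution introduced in `dandelion/base.py` can be summarised by a small standalone sketch. It is not the library's code: `AuthError`, `check_auth` and `resolve_auth` are hypothetical names used only here, and the sketch returns the request parameters directly instead of setting attributes on the request object.

```python
class AuthError(Exception):
    """Stand-in for DandelionException in this sketch."""


def check_auth(token=None, app_id=None, app_key=None):
    """Return request parameters for one credential source, or None if it is empty."""
    if token:
        if app_id or app_key:
            raise AuthError("specify 'token' OR 'app_id' and 'app_key', not both")
        return {'token': token}
    if app_id and app_key:
        return {'$app_id': app_id, '$app_key': app_key}
    if app_id or app_key:
        raise AuthError("legacy authentication needs both 'app_id' and 'app_key'")
    return None


def resolve_auth(explicit, defaults):
    """Prefer constructor arguments; fall back to the defaults only if none were given."""
    params = check_auth(**explicit)
    if params is None:
        params = check_auth(**defaults)
    if params is None:
        raise AuthError("no authentication parameters found")
    return params


# Explicit credentials win even when the defaults hold a different kind of credential.
chosen = resolve_auth({'token': 'YOUR_TOKEN'}, {'app_id': 'a', 'app_key': 'k'})
```

The point of the two-step lookup is that a partially specified or contradictory source fails loudly instead of silently falling through to the other one.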
2 changes: 1 addition & 1 deletion dandelion/cache/base.py
Expand Up @@ -16,7 +16,7 @@ def get_key_for(**kwargs):
import six
input_s = ''
for key in sorted(kwargs):
input_s += '{}={},'.format(key, kwargs[key])
input_s += u'{}={},'.format(key, kwargs[key])
if isinstance(input_s, six.text_type):
input_s = input_s.encode('utf-8')
return hashlib.sha1(input_s).hexdigest()
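For reference, the key derivation touched by this hunk can be sketched without `six`. This is a Python 3-only approximation under that assumption, not the code in the repository; `cache_key` is a hypothetical name.

```python
import hashlib


def cache_key(**kwargs):
    """Build a deterministic SHA-1 key from keyword arguments."""
    # Sorting the keys makes the result independent of argument order.
    joined = ''.join('{}={},'.format(key, kwargs[key]) for key in sorted(kwargs))
    # In Python 3 the formatted string is always text, so it is encoded unconditionally.
    return hashlib.sha1(joined.encode('utf-8')).hexdigest()


key = cache_key(text='caff\u00e8 espresso', lang='it')
```

Sorting before hashing is what lets two calls with the same parameters in a different order share one cache entry.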
8 changes: 6 additions & 2 deletions dandelion/datagem.py
Expand Up @@ -4,7 +4,7 @@

import warnings

from dandelion.base import DandelionException, BaseDandelionRequest
from dandelion.base import BaseDandelionRequest, DandelionException


class Datagem(BaseDandelionRequest):
Expand Down Expand Up @@ -72,7 +72,11 @@ def get(self, **kwargs):
raise DandelionException('The requested item does not exist')

def select(self, *args):
self.params['$select'] = ','.join(args)
if '$select' not in self.params or self.params['$select'] == '':
self.params['$select'] = ','.join(args)
elif args:
self.params['$select'] = self.params['$select']+','+(','.join(args))

if any(param.startswith('count(') for param in args):
self.params['$group'] = ','.join(
param for param in args if not param.startswith('count(')
Expand Down
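The accumulating behaviour added to `select` can be illustrated with a standalone sketch. `add_select` is a hypothetical helper operating on a plain dict, and the field names other than `acheneID` are made up for the example; the grouping logic for `count(...)` is left out.

```python
def add_select(params, *fields):
    """Append ``fields`` to the comma-separated ``$select`` entry of ``params``."""
    # Drop empty pieces so a missing or empty "$select" never yields a leading comma.
    existing = [f for f in params.get('$select', '').split(',') if f]
    params['$select'] = ','.join(existing + list(fields))
    return params


params = add_select({}, 'acheneID', 'name')
params = add_select(params, 'population')
```

Splitting and re-joining handles the "missing", "empty" and "already populated" cases in one path, which is the same outcome the three branches in the diff aim for.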
18 changes: 14 additions & 4 deletions dandelion/datatxt.py
@@ -1,14 +1,24 @@
""" classes for querying the dataTXT family
"""
from dandelion.base import BaseDandelionRequest
from dandelion.base import BaseDandelionRequest, DandelionException


class DataTXT(BaseDandelionRequest):
""" class for accessing the dataTXT family
"""
def nex(self, text, **params):
if 'min_confidence' not in params:
params['min_confidence'] = 0.6
    def nex(self, text, top_entities=None, min_confidence=None, **params):
        # numbers.Integral covers int on Python 3 and both int and long on
        # Python 2; the bare name `long` does not exist on Python 3.
        import numbers

        if top_entities is not None:
            if not isinstance(top_entities, numbers.Integral) or top_entities < 0:
                raise DandelionException('The \'top_entities\' parameter must be an integer greater than or equal to 0')
            else:
                params['top_entities'] = top_entities

        if min_confidence is not None:
            if not isinstance(min_confidence, float) or min_confidence < 0.0 or min_confidence > 1.0:
                raise DandelionException('The \'min_confidence\' parameter must be a float between 0.0 and 1.0')
            else:
                params['min_confidence'] = min_confidence

        return self.do_request(
            dict(params, text=text), ('nex', 'v1')
        )
Expand Down
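The parameter checks in `nex` can be exercised in isolation with a sketch like the following. It is an illustration, not the library's code: `validate_nex_params` is a hypothetical helper, `ValueError` stands in for `DandelionException`, and, unlike the diff, it deliberately accepts integers such as `0` and `1` for `min_confidence`.

```python
import numbers


def validate_nex_params(top_entities=None, min_confidence=None):
    """Return the validated optional parameters as a dict, raising on bad values."""
    params = {}
    if top_entities is not None:
        # bool is a subclass of int, so it is rejected explicitly.
        if (isinstance(top_entities, bool)
                or not isinstance(top_entities, numbers.Integral)
                or top_entities < 0):
            raise ValueError("'top_entities' must be an integer >= 0")
        params['top_entities'] = top_entities
    if min_confidence is not None:
        if (isinstance(min_confidence, bool)
                or not isinstance(min_confidence, numbers.Real)
                or not 0.0 <= min_confidence <= 1.0):
            raise ValueError("'min_confidence' must be a number between 0.0 and 1.0")
        params['min_confidence'] = min_confidence
    return params


checked = validate_nex_params(top_entities=5, min_confidence=0.7)
```

Parameters left as `None` are simply omitted, so the server-side defaults apply, which matches the diff's removal of the hard-coded `0.6`.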
21 changes: 21 additions & 0 deletions dandelion/sentiment.py
@@ -0,0 +1,21 @@
""" classes for querying the Sentiment API
"""
from dandelion.base import BaseDandelionRequest, DandelionException


class Sentiment(BaseDandelionRequest):
    """ class for accessing the Sentiment API
    """
    def sent(self, text, lang=None, **params):
        if lang is not None:
            if lang not in ['en', 'it', 'auto']:
                raise DandelionException('Illegal \'lang\' parameter value!')
            else:
                params['lang'] = lang

        return self.do_request(
            dict(params, text=text), ('sent', 'v1')
        )

    def _get_uri_tokens(self):
        return 'datatxt',
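How the URI tokens and the `('sent', 'v1')` tuple might combine into an endpoint can be sketched as follows. Only the `''.join('/' + x for x in extra_url)` step appears in the diff; the `https://` scheme and the way `_get_uri` joins the host with the URI tokens are assumptions made for this illustration, and `build_url` is a hypothetical name.

```python
def build_url(host, uri_tokens, extra_url=''):
    """Join host, service tokens and per-call path segments into one URL."""
    # Assumed composition of the base URI; _get_uri itself is not shown in the diff.
    base = 'https://' + host + ''.join('/' + token for token in uri_tokens)
    # Same expression as in do_request: each extra segment gets a leading slash,
    # and the default empty string contributes nothing.
    return base + ''.join('/' + x for x in extra_url)


url = build_url('api.dandelion.eu', ('datatxt',), ('sent', 'v1'))
```

Returning `'datatxt'` from `_get_uri_tokens` is what places the Sentiment service under the same path prefix as the dataTXT family.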
38 changes: 33 additions & 5 deletions docs/base.rst
Expand Up @@ -10,26 +10,54 @@ For specific documentation on each service, please refer to their page.
Authentication
--------------
Most (all?) of the dandelion.eu_ services require authentication. You can
find your authentication keys on your dashboard_ and pass them to the class
find your authentication token on your dashboard_ and pass it to the class
constructor, for example::

>>> from dandelion import Datagem
>>> administrative_regions = Datagem('administrative-regions',
... app_id='24cxxxx',
... app_key='8697xxxx8b99xxxxeecbxxxxb163xxxx')
... token='7682xxxxxeh2nb2v2mxxxxxxxjh9sbxxxx')


If you need to instantiate more services, you can specify your authentication
keys just once using ``dandelion.default_config``::
token just once using ``dandelion.default_config`` ::

>>> from dandelion import default_config
>>> default_config['token'] = '7682xxxxxeh2nb2v2mxxxxxxxjh9sbxxxx'

>>> from dandelion import DataTXT, Datagem, Sentiment
>>> datatxt = DataTXT()
>>> administrative_regions = Datagem('administrative-regions')
>>> sentiment = Sentiment()

Legacy authentication system
----------------------------
The client still supports authentication through ``$app_id`` and ``$app_key``.
Usage is the same as for the token: you can pass ``app_id`` and ``app_key``
to the class constructor or specify them once using ``dandelion.default_config`` ::

>>> from dandelion import Datagem
>>> administrative_regions = Datagem('administrative-regions',
... app_id='24cxxxx',
... app_key='8697xxxx8b99xxxxeecbxxxxb163xxxx')

OR

>>> from dandelion import default_config
>>> default_config['app_id'] = '24cxxxx'
>>> default_config['app_key'] = '8697xxxx8b99xxxxeecbxxxxb163xxxx'

>>> from dandelion import DataTXT, Datagem
>>> from dandelion import DataTXT, Datagem, Sentiment
>>> datatxt = DataTXT()
>>> administrative_regions = Datagem('administrative-regions')
>>> sentiment = Sentiment()

Notes on authentication
-----------------------
You have to specify the ``token`` OR the pair (``app_id``, ``app_key``): if you specify
them both, the client will raise an exception.

Moreover, the client falls back to the values stored in ``default_config`` only
if you pass no authentication parameters at all to the class constructor.

Caching your queries
--------------------
Expand Down
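The key restriction behind `default_config` can be reproduced with a small dict subclass. This is a sketch in the spirit of `DandelionConfig`, not a copy of it: `RestrictedConfig` is a hypothetical name and `ValueError` stands in for `DandelionException`.

```python
class RestrictedConfig(dict):
    """Dictionary that only accepts a fixed set of keys on item assignment."""
    ALLOWED_KEYS = ('token', 'app_id', 'app_key')

    def __setitem__(self, key, value):
        if key not in self.ALLOWED_KEYS:
            raise ValueError('invalid config param: {}'.format(key))
        super().__setitem__(key, value)


cfg = RestrictedConfig()
cfg['token'] = 'YOUR_TOKEN'

# dict.update() does not go through an overridden __setitem__,
# so this call is not validated by the class above.
cfg.update(unknown='value')
```

Because only item assignment is checked, setting values with `config[key] = value`, as the documentation examples do, is the form that benefits from the validation.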
6 changes: 4 additions & 2 deletions docs/datagem.rst
Expand Up @@ -9,7 +9,8 @@ Get entities
Retrieving entities from dandelion is easy, just instantiate a datagem and iterate; pagination is implemented automatically for you, so don't worry and just get data::

>>> from dandelion import Datagem
>>> d = Datagem('administrative-regions')
>>> d = Datagem('administrative-regions',
... token='7682xxxxxeh2nb2v2mxxxxxxxjh9sbxxxx')
>>> for obj in d.items[:10]:
... print(obj.acheneID)
...
Expand All @@ -31,7 +32,8 @@ Select fields
If you want to reduce the network load, you can retrieve only the fields you will actually use with ``select``::

>>> from dandelion import Datagem
>>> d = Datagem('administrative-regions')
>>> d = Datagem('administrative-regions',
... token='7682xxxxxeh2nb2v2mxxxxxxxjh9sbxxxx')
>>> for obj in d.items.select('acheneID')[:10]:
... print(obj)
...
Expand Down
14 changes: 6 additions & 8 deletions docs/datatxt.rst
@@ -1,7 +1,5 @@
:title:
The dataTXT API

.. _SpazioDati: http://www.spaziodati.eu
.. _check here which ones are supported: https://dandelion.eu/docs/api/datatxt/nex/v1/#param-lang
.. _dataTXT-NEX documentation on dandelion.eu: https://dandelion.eu/docs/api/datatxt/nex/v1/
.. _dataTXT-SIM documentation on dandelion.eu: https://dandelion.eu/docs/api/datatxt/sim/v1/
.. _dataTXT-LI documentation on dandelion.eu: https://dandelion.eu/docs/api/datatxt/li/v1/
Expand All @@ -13,16 +11,16 @@ dataTXT is a family of semantic services developed by SpazioDati_. All its
methods are available in the same class::

>>> from dandelion import DataTXT
>>> datatxt = DataTXT(app_id='', app_key='')
>>> datatxt = DataTXT(token='')


NEX: Named Entity Extraction
----------------------------
dataTXT-NEX is a named entity extraction & linking API that performs very well
even on short texts, on which many other similar services do not. dataTXT-NEX
currently works on Italian and English texts. With this API you will be able
to automatically tag your texts, extracting Wikipedia entities and enriching
your data.
currently supports several languages (`check here which ones are supported`_).
With this API you will be able to automatically tag your texts, extracting Wikipedia
entities and enriching your data.

You can extract annotated entities with::

Expand All @@ -42,7 +40,7 @@ Additional parameters can be specified simply by::
'http://dbpedia.org/resource/Open_source']

Check out the `dataTXT-NEX documentation on dandelion.eu`_ for more information
about what can be done with NEX.
about what can be done with NEX (in particular see the ``top_entities`` parameter!).


SIM: Text Similarity
Expand Down
1 change: 1 addition & 0 deletions docs/index.rst
Expand Up @@ -28,3 +28,4 @@ Contents
base
datatxt
datagem
sentiment