Commit 08cf055

committed
Page on blobs
1 parent 0e3fae6 commit 08cf055

3 files changed

+142
-1
lines changed

docs/source/blobs.rst

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
1+
Blob input/output
2+
=================
3+
4+
`Blob` objects allow binary data to be returned by an Action. This binary data can be passed between Things, or between Things and client code. Using a `Blob` object allows binary data to be efficiently sent over HTTP if required, and allows the same code to run either on the server (without copying the data) or on a client (where data is transferred over HTTP).
5+
6+
If interactions require only simple data types that can easily be represented in JSON, very little thought needs to be given to data types: strings and numbers will be converted to and from JSON automatically, and your Python code should only ever see native Python data types, whether it's running on the server or a remote client. However, if you want to transfer larger data objects such as images, large arrays or other binary data, you will need to use a `Blob` object.
7+
8+
`Blob` objects are not part of the Web of Things specification, which is most often used with fairly simple data structures in JSON. In LabThings-FastAPI, the `Blob` mechanism is intended to provide an efficient way to work with arbitrary binary data. If it's used to transfer data between two `Thing` instances on the same server, the data should not be copied or otherwise iterated over. When it must be transferred over the network, this can be done using a binary transfer, rather than embedding it in JSON with base64 encoding.
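
To make the cost of embedding binary data in JSON concrete, here is a minimal sketch using only the standard library. It does not use LabThings-FastAPI at all; the payload and the `"data"` key are illustrative assumptions. It simply shows that base64 turns every 3 bytes into 4 characters, so the encoded form is a third larger before any JSON overhead is added.

```python
import base64
import json
import os

# Stand-in for binary data such as an image (random bytes, purely illustrative).
payload = os.urandom(3000)

# JSON can only carry text, so the bytes must be base64-encoded before embedding.
encoded = base64.b64encode(payload)
embedded = json.dumps({"data": encoded.decode("ascii")})

# base64 maps each group of 3 bytes to 4 ASCII characters.
encoded_length = len(encoded)
print(len(payload), encoded_length)  # 3000 4000
```

A binary transfer sends the 3000 bytes as they are, and also avoids the encode and decode passes over the data.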
9+
10+
A `Blob` consists of some data and a MIME type, which sets how the data should be interpreted. It is best to create a subclass of `Blob` with the content type set: this makes it clear what kind of data is in the `Blob`. In the future, it might be possible to add functionality to `Blob` subclasses, for example to make it simple to obtain a `PIL` `Image` object from a `Blob` containing JPEG data. However, this will not yet work across both client and server code.
11+
12+
Creating and using `Blob` objects
13+
------------------------------------------------
14+
15+
Blobs can be created from binary data that is in memory (a `bytes` object), on disk (a file), or using a URL as a placeholder. The intention is that the code that uses a `Blob` should not need to know which of these is the case, and should be able to use the same code regardless of how the data is stored.
16+
17+
Blobs offer three ways to access their data:
18+
19+
* A `bytes` object, obtained via the `data` property. For blobs created with a `bytes` object, this simply returns the original data object with no copying. If the data is stored in a file, the file is opened and read when the `data` property is accessed. If the `Blob` references a URL, it is retrieved and returned when `data` is accessed.
20+
* An `open()` method providing a file-like object. This returns a `BytesIO` wrapper if the `Blob` was created from a `bytes` object or the file if the data is stored on disk. URLs are retrieved, stored as `bytes` and returned wrapped in a `BytesIO` object.
21+
* A `save` method, which either writes the data to a file or copies the existing file on disk. If the `Blob` points to a file rather than data in memory, this should be more efficient than loading `data` and writing it to a file.
22+
23+
In short, `Blob` objects may be used identically whether the data is in memory, on disk or at a remote URL.
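
The three access routes can be sketched with two small stand-in classes. This is not the LabThings-FastAPI implementation: `BytesBackedSketch`, `FileBackedSketch` and `first_two_bytes` are hypothetical names invented for illustration, and only the behaviour described in the list above is modelled.

```python
import io
import shutil
import tempfile
from pathlib import Path


class BytesBackedSketch:
    """Hypothetical stand-in for a blob whose data lives in memory."""

    def __init__(self, data: bytes):
        self._data = data

    @property
    def data(self) -> bytes:
        return self._data  # the original object, no copy

    def open(self):
        return io.BytesIO(self._data)  # file-like wrapper around the bytes

    def save(self, filename: str) -> None:
        Path(filename).write_bytes(self._data)


class FileBackedSketch:
    """Hypothetical stand-in for a blob whose data lives in a file."""

    def __init__(self, path: str):
        self._path = path

    @property
    def data(self) -> bytes:
        return Path(self._path).read_bytes()  # read only when requested

    def open(self):
        return open(self._path, "rb")  # the file itself

    def save(self, filename: str) -> None:
        shutil.copyfile(self._path, filename)  # copy without loading into Python


def first_two_bytes(blob) -> bytes:
    """Consumer code that does not care how the blob is stored."""
    with blob.open() as f:
        return f.read(2)


content = b"\xff\xd8 pretend JPEG"
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.bin"
    source.write_bytes(content)

    in_memory = BytesBackedSketch(content)
    on_disk = FileBackedSketch(str(source))
    results = [first_two_bytes(b) for b in (in_memory, on_disk)]

    copy_path = Path(tmp) / "copy.bin"
    on_disk.save(str(copy_path))
    copied = copy_path.read_bytes()
```

The function `first_two_bytes` works unchanged for both storage types, which is the property the real `Blob` interface is intended to provide.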
24+
25+
Examples
26+
--------
27+
28+
A camera might want to return an image as a `Blob` object. The code for the action might look like this:
29+
30+
.. code-block:: python
31+
32+
    from labthings_fastapi.blob import Blob
33+
    from labthings_fastapi.thing import Thing
34+
    from labthings_fastapi.decorators import thing_action
35+
36+
    class JPEGBlob(Blob):
37+
        content_type = "image/jpeg"
38+
39+
    class Camera(Thing):
40+
        @thing_action
41+
        def capture_image(self) -> JPEGBlob:
42+
            # Capture an image and return it as a Blob
43+
            image_data = self._capture_image()  # This returns a bytes object holding the JPEG data
44+
            return JPEGBlob.from_bytes(image_data)
45+
46+
The corresponding client code might look like this:
47+
48+
.. code-block:: python
49+
50+
    from PIL import Image
51+
    from labthings_fastapi.client import ThingClient
52+
53+
    camera = ThingClient.from_url("http://localhost:5000/camera/")
54+
    image_blob = camera.capture_image()
55+
    image_blob.save("captured_image.jpg")  # Save the image to a file
56+
57+
    # We can also open the image directly with PIL
58+
    with image_blob.open() as f:
59+
        img = Image.open(f)
60+
        img.show()  # This will display the image in a window
61+
62+
We could define a more sophisticated camera that can capture raw images and convert them to JPEG, using two actions:
63+
64+
.. code-block:: python
65+
66+
    from labthings_fastapi.blob import Blob
67+
    from labthings_fastapi.thing import Thing
68+
    from labthings_fastapi.decorators import thing_action
69+
70+
    class JPEGBlob(Blob):
71+
        content_type = "image/jpeg"
72+
73+
    class RAWBlob(Blob):
74+
        content_type = "image/x-raw"
75+
76+
    class Camera(Thing):
77+
        @thing_action
78+
        def capture_raw_image(self) -> RAWBlob:
79+
            # Capture a raw image and return it as a Blob
80+
            raw_data = self._capture_raw_image()  # This returns a bytes object holding the raw data
81+
            return RAWBlob.from_bytes(raw_data)
82+
83+
        @thing_action
84+
        def convert_raw_to_jpeg(self, raw_blob: RAWBlob) -> JPEGBlob:
85+
            # Convert a raw image Blob to a JPEG Blob
86+
            jpeg_data = self._convert_raw_to_jpeg(raw_blob.data)  # This returns a bytes object holding the JPEG data
87+
            return JPEGBlob.from_bytes(jpeg_data)
88+
89+
        @thing_action
90+
        def capture_image(self) -> JPEGBlob:
91+
            # Capture an image and return it as a Blob
92+
            raw_blob = self.capture_raw_image()  # Capture the raw image
93+
            jpeg_blob = self.convert_raw_to_jpeg(raw_blob)  # Convert the raw image to JPEG
94+
            return jpeg_blob  # Return the JPEG Blob
95+
            # NB the `raw_blob` is not retained after this action completes, so it will be garbage collected
96+
97+
On the client, we can use the `capture_image` action directly (as before), or we can capture a raw image and convert it to JPEG:
98+
99+
.. code-block:: python
100+
101+
    from PIL import Image
102+
    from labthings_fastapi.client import ThingClient
103+
104+
    camera = ThingClient.from_url("http://localhost:5000/camera/")
105+
106+
    # Capture a JPEG image directly
107+
    jpeg_blob = camera.capture_image()
108+
    jpeg_blob.save("captured_image.jpg")
109+
110+
    # Alternatively, capture a raw image and convert it to JPEG
111+
    raw_blob = camera.capture_raw_image()  # NB the raw image is not yet downloaded
112+
    jpeg_blob = camera.convert_raw_to_jpeg(raw_blob)
113+
    jpeg_blob.save("converted_image.jpg")
114+
115+
    raw_blob.save("raw_image.raw")  # Download and save the raw image to a file
116+
117+
118+
Using `Blob` objects as inputs
119+
------------------------------
120+
121+
`Blob` objects may be used as either the input or output of an action. There are relatively few good use cases for `Blob` inputs to actions, but a possible example would be image capture: one action could perform a quick capture of raw data, and another action could convert the raw data into a useful image. The output of the capture action would be a `Blob` representing the raw data, which could be passed to the conversion action.
122+
123+
Because `Blob` outputs are represented in JSON as links, they are downloaded with a separate HTTP request if needed. There is currently no way to create a `Blob` on the server via HTTP, which means remote clients can use `Blob` objects provided in the output of actions but they cannot yet upload data to be used as input. However, it is possible to pass the URL of a `Blob` that already exists on the server as input to a subsequent Action. This means, in the example above of raw image capture, a remote client over HTTP can pass the raw `Blob` to the conversion action, and the raw data need never be sent over the network.
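
The idea of a link-only `Blob` on the client can be sketched as follows. This is an illustration under stated assumptions, not the library's client code: `UrlBlobSketch`, `as_input`, `fake_fetch` and the URL are all hypothetical, and caching the downloaded bytes is an assumption made for the sketch. The download function is injected so that the example runs without a server.

```python
from typing import Callable, Optional


class UrlBlobSketch:
    """Hypothetical client-side blob that holds a link, not the data."""

    def __init__(self, url: str, content_type: str, fetch: Callable[[str], bytes]):
        self.url = url
        self.content_type = content_type
        self._fetch = fetch
        self._cached: Optional[bytes] = None

    @property
    def data(self) -> bytes:
        # The separate download happens only when the data is actually needed.
        if self._cached is None:
            self._cached = self._fetch(self.url)
        return self._cached

    def as_input(self) -> dict:
        # Passing the blob to another action only requires sending the link.
        return {"url": self.url, "content_type": self.content_type}


downloads = []


def fake_fetch(url: str) -> bytes:
    """Stand-in for an HTTP GET; records each call so downloads can be counted."""
    downloads.append(url)
    return b"raw sensor data"


# Hypothetical URL, as might appear in the JSON output of a capture action.
raw = UrlBlobSketch("http://localhost:5000/blob/1234", "image/x-raw", fake_fetch)

action_input = raw.as_input()       # what a conversion action would receive
downloads_before = len(downloads)   # nothing has been downloaded yet
raw_bytes = raw.data                # triggers the one and only download
downloads_after = len(downloads)
```

Building `action_input` never touches the data, which is why the raw image in the camera example need not cross the network unless the client asks for it.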
124+
125+
Memory management and retention
126+
-------------------------------
127+
128+
Management of `Blob` objects is currently very basic: when a `Blob` object is returned in the output of an Action that has been called via the HTTP interface, a fixed 5-minute expiry is used. This should be improved in the future to avoid memory management issues.
129+
130+
The behaviour is different when actions are called from other actions. If `action_a` calls `action_b`, and `action_b` returns a `Blob`, that `Blob` will be subject to Python's usual garbage collection rules when `action_a` ends. In other words, it will not be retained unless it is included in the output of `action_a`.
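
The garbage collection behaviour for nested calls is ordinary Python, and can be observed with a weak reference. In this sketch `BlobSketch` is a hypothetical plain class, and `action_a` and `action_b` are plain functions standing in for actions; nothing here uses LabThings-FastAPI.

```python
import gc
import weakref


class BlobSketch:
    """Hypothetical stand-in for a Blob; just holds some bytes."""

    def __init__(self, data: bytes):
        self.data = data


observed = {}


def action_b() -> BlobSketch:
    return BlobSketch(b"raw data")


def action_a() -> BlobSketch:
    raw = action_b()
    # Watch the intermediate object without keeping it alive.
    observed["raw"] = weakref.ref(raw)
    # Only the converted result is returned; `raw` is not part of the output.
    return BlobSketch(raw.data.upper())


result = action_a()
gc.collect()  # not needed on CPython, but makes the intent explicit

raw_still_alive = observed["raw"]() is not None
print(raw_still_alive)  # False
```

Had `action_a` returned `raw` itself, the caller's reference would have kept it alive.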
131+
132+
HTTP interface and serialization
133+
--------------------------------
134+
135+
`Blob` objects are subclasses of `pydantic.BaseModel`, which means they can be serialized to JSON and deserialized from JSON. When this happens, the `Blob` is represented as a JSON object with two fields: `url`, a link to the data, and `content_type`, a string giving the MIME type of the data. When a `Blob` is serialized, a URL is generated with a unique ID to allow it to be downloaded. However, only a weak reference is held to the `Blob`. Once an Action has finished running, the only strong reference to the `Blob` should be held by the output property of the action invocation. The `Blob` should be garbage collected once the output is no longer required, i.e. when the invocation is discarded. Currently this happens 5 minutes after the action completes, once the maximum number of invocations has been reached, or when the invocation is explicitly deleted by the client.
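
The weak-reference scheme can be sketched with the standard library. This is a hedged illustration, not the LabThings-FastAPI implementation: `BlobSketch`, `blob_registry`, `serialize`, `lookup`, the `/blob/` URL path and the `invocation_outputs` dictionary are all assumptions made for the example.

```python
import gc
import uuid
import weakref


class BlobSketch:
    """Hypothetical stand-in for a Blob with data and a MIME type."""

    def __init__(self, data: bytes, content_type: str):
        self.data = data
        self.content_type = content_type


# Maps unique IDs to blobs without keeping the blobs alive.
blob_registry = weakref.WeakValueDictionary()


def serialize(blob: BlobSketch, base_url: str = "http://localhost:5000/blob/") -> dict:
    """Register the blob under a fresh ID and return its JSON-style representation."""
    blob_id = str(uuid.uuid4())
    blob_registry[blob_id] = blob
    return {"url": base_url + blob_id, "content_type": blob.content_type}


def lookup(url: str):
    """Return the blob for a URL, or None if it has been garbage collected."""
    return blob_registry.get(url.rsplit("/", 1)[-1])


# The invocation output holds the only strong reference to the blob.
invocation_outputs = {"invocation-1": BlobSketch(b"\xff\xd8", "image/jpeg")}
link = serialize(invocation_outputs["invocation-1"])

found_while_retained = lookup(link["url"]) is not None

# Discarding the invocation (expiry, limit reached, or explicit deletion)
# drops the last strong reference, so the registry entry disappears too.
del invocation_outputs["invocation-1"]
gc.collect()

found_after_discard = lookup(link["url"]) is not None
```

Because the registry never owns the data, the lifetime of a blob is governed entirely by whoever holds the invocation output.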

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Documentation for LabThings-FastAPI
9 9
wot_core_concepts.rst
10 10
tutorial/index.rst
11 11
dependencies/dependencies.rst
12+
blobs.rst
12 13
concurrency.rst
13 14
client_code.rst
14 15

docs/source/tutorial/index.rst

Lines changed: 6 additions & 1 deletion
@@ -5,9 +5,14 @@ LabThings-FastAPI tutorial
5 5
6 6
installing_labthings.rst
7 7
running_labthings.rst
8+
9+
..
10+
In due course, these pages should exist...
8 11
writing_a_thing.rst
9 12
client_code.rst
10 13
blobs.rst
11 14
thing_dependencies.rst
12 15
13-
In this tutorial, we'll cover how to start up and interact with a LabThings-FastAPI server, how to write a Thing, and a few more advanced topics. It is intended as an introduction for someone using LabThings-FastAPI and/or writing Thing code to implement a new instrument. It doesn't detail all the internal workings of the library, but we hope this isn't needed for most people.
16+
In this tutorial, we'll cover how to start up and interact with a LabThings-FastAPI server.
17+
18+
In the future, it should include how to write a Thing, and a few more advanced topics. It is intended as an introduction for someone using LabThings-FastAPI and/or writing Thing code to implement a new instrument.
