Interface BinaryUpload
@ProviderType public interface BinaryUpload
Describes uploading a binary through HTTP requests in a single or multiple parts. This will be returned by JackrabbitValueFactory.initiateBinaryUpload(long, int). A high-level overview of the process can be found in JackrabbitValueFactory.
Note that although the API allows URI schemes other than "http(s)", the upload functionality is currently only defined for HTTP.
A caller usually needs to pass the information provided by this interface to a remote client that is in possession of the actual binary, which then has to upload the binary using HTTP according to the logic described below. If multiple URIs are returned, the remote client is expected to support multi-part uploads as described below.
Once a remote client finishes uploading the binary data, the application must be notified and must then call JackrabbitValueFactory.completeBinaryUpload(String) to complete the upload. This completion requires the exact upload token obtained from getUploadToken().
Upload algorithm
A remote client will have to follow this algorithm to upload a binary based on the information provided by this interface.
Please be aware that if the size passed to JackrabbitValueFactory.initiateBinaryUpload(long, int) was an estimation, but the actual binary is larger, there is no guarantee the upload will be possible using all of the getUploadURIs() and the getMaxPartSize(). In such cases, the application should restart the transaction using the correct size.
Variables used
- fileSize: the actual binary size (must be known at this point)
- minPartSize: the value from getMinPartSize()
- maxPartSize: the value from getMaxPartSize()
- numUploadURIs: the number of entries in getUploadURIs()
- uploadURIs: the entries in getUploadURIs()
- partSize: the part size to be used in the upload (to be determined in the algorithm)
Steps
- If (fileSize / maxPartSize) > numUploadURIs, then the client cannot proceed and will have to request a new set of URIs with the correct fileSize as maxSize.
- Calculate the partSize and the number of URIs to use.
  The easiest way to do this is to use maxPartSize as the value for partSize. As long as the size of the actual binary upload is less than or equal to the size passed to JackrabbitValueFactory.initiateBinaryUpload(long, int), a non-null BinaryUpload object returned from that call means you are guaranteed to be able to upload the binary successfully, using the provided uploadURIs, so long as the value you use for partSize is maxPartSize. Note that it is not required to use all of the URIs provided in uploadURIs if not all URIs are required to upload the entire binary with the selected partSize.
  However, there are some exceptions to consider:
  - If fileSize < minPartSize, then take the first provided upload URI to upload the entire binary, with partSize = fileSize. Note that it is not required to use all of the URIs provided in uploadURIs.
  - If fileSize / partSize == numUploadURIs, all part URIs must be used. The partSize to use for all parts except the last would be calculated using:
    partSize = (fileSize + numUploadURIs - 1) / numUploadURIs
    It is also possible to simply use maxPartSize as the value for partSize in this case, for every part except the last.
  A client may also choose a different partSize, for example if the client has more information about the conditions of the network or other information that would make a different partSize preferable. In this case a different value may be chosen, under the condition that all of the following are true:
  - partSize >= minPartSize
  - partSize <= maxPartSize (unless maxPartSize = -1, meaning unlimited)
  - partSize > (fileSize / numUploadURIs)
- Upload: segment the binary into parts of partSize bytes. For each segment, take the next URI from uploadURIs (strictly in order) and send the segment with a standard HTTP PUT. The last part uses whatever segment size is left.
- If a segment fails during upload, retry (up to a certain timeout).
- After the upload has finished successfully, notify the application, for example through a complete request, passing the upload token, and the application will call JackrabbitValueFactory.completeBinaryUpload(String) with the token.
  The only timeout restrictions for calling JackrabbitValueFactory.completeBinaryUpload(String) are those imposed by the cloud blob storage service on uploaded blocks. Upload tokens themselves do not time out, which allows you to be very lenient in allowing uploads to complete, and very resilient in handling temporary network issues or other issues that might impact the uploading of one or more blocks.
  In the case that the upload cannot be finished (for example, one or more segments cannot be uploaded even after a reasonable number of retries), do not call JackrabbitValueFactory.completeBinaryUpload(String). Instead, simply restart the upload from the beginning by calling JackrabbitValueFactory.initiateBinaryUpload(long, int) when the situation preventing a successful upload has been resolved.
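The part-size selection described in the steps above can be sketched in Java as follows. This is only an illustrative sketch under stated assumptions, not part of the Jackrabbit API: the class PartSizePlanner and its methods are hypothetical names, the sketch always uses the simple strategy (the whole binary if it fits in one part, otherwise maxPartSize), and the URI-count check rounds the division up, which is a conservative reading of the first step.

```java
/** Hypothetical helper for remote clients; not part of the Jackrabbit API. */
final class PartSizePlanner {

    /** Integer division rounded up, for positive operands. */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /**
     * Picks the size of every part except the last using the simple strategy:
     * the whole binary if it fits in one part, otherwise maxPartSize.
     */
    static long choosePartSize(long fileSize, long minPartSize,
                               long maxPartSize, int numUploadURIs) {
        if (fileSize <= 0 || numUploadURIs <= 0) {
            throw new IllegalArgumentException("fileSize and numUploadURIs must be positive");
        }
        // A binary smaller than minPartSize goes to the first URI in one part.
        if (fileSize < minPartSize) {
            return fileSize;
        }
        // maxPartSize == -1 means unlimited, so a single part is enough.
        if (maxPartSize == -1) {
            return fileSize;
        }
        if (maxPartSize <= 0) {
            throw new IllegalArgumentException("maxPartSize must be positive or -1");
        }
        // Assumption: round up, so a partial trailing part also needs a URI.
        if (ceilDiv(fileSize, maxPartSize) > numUploadURIs) {
            throw new IllegalStateException(
                "Not enough upload URIs; initiate a new upload with the correct size");
        }
        return Math.min(fileSize, maxPartSize);
    }

    /** Number of URIs consumed when uploading with the given part size. */
    static long partCount(long fileSize, long partSize) {
        return ceilDiv(fileSize, partSize);
    }
}
```

For a binary of 250000000 bytes with a maxPartSize of 104857600 and four URIs, this sketch selects a part size of 104857600 and uses three of the four URIs.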
Example JSON view
A JSON representation of this interface as passed back to a remote client might look like this:
  {
    "uploadToken": "aaaa-bbbb-cccc-dddd-eeee-ffff-gggg-hhhh",
    "minPartSize": 10485760,
    "maxPartSize": 104857600,
    "uploadURIs": [
      "http://server.com/upload/1",
      "http://server.com/upload/2",
      "http://server.com/upload/3",
      "http://server.com/upload/4"
    ]
  }
Method Summary
Modifier and Type | Method | Description
long | getMaxPartSize() | Return the largest possible part size in bytes.
long | getMinPartSize() | Return the smallest possible part size in bytes.
@NotNull java.lang.String | getUploadToken() | Returns a token identifying this upload.
@NotNull java.lang.Iterable<java.net.URI> | getUploadURIs() | Returns a list of URIs that can be used for uploading binary data directly to a storage location in one or more parts.
Method Detail
-
getUploadURIs
@NotNull java.lang.Iterable<java.net.URI> getUploadURIs()
Returns a list of URIs that can be used for uploading binary data directly to a storage location in one or more parts.
Remote clients must support multi-part uploading as per the upload algorithm described above. Clients are not necessarily required to use all of the URIs provided. A client may choose to use fewer, or even only one of the URIs. However, it must always ensure the part size is between getMinPartSize() and getMaxPartSize(). These can reflect strict limitations of the storage provider.
Regardless of the number of URIs used, they must be consumed in sequence, without skipping any, and the order of parts the original binary is split into must correspond exactly with the order of URIs.
For example, if a client wishes to upload a binary in three parts and there are five URIs returned, the client must use the first URI to upload the first part, the second URI to upload the second part, and the third URI to upload the third part. The client is not required to use the fourth and fifth URIs. However, using the second URI to upload the third part may result in either an upload failure or a corrupted upload; likewise, skipping the second URI to use subsequent URIs may result in either an upload failure or a corrupted upload.
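The ordering rule can be illustrated with a small planning sketch that pairs each byte range of the binary with the next URI, strictly in sequence. The class UploadPlan and its nested Part type are hypothetical names introduced for illustration only; they are not part of the Jackrabbit API, and the sketch does not perform any HTTP requests itself.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical client-side helper; not part of the Jackrabbit API. */
final class UploadPlan {

    /** One part: the bytes [offset, offset + length) are sent to uri. */
    static final class Part {
        final URI uri;
        final long offset;
        final long length;

        Part(URI uri, long offset, long length) {
            this.uri = uri;
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Assigns consecutive segments of the binary to the URIs in iteration
     * order, never skipping a URI. Trailing URIs may remain unused.
     */
    static List<Part> plan(Iterable<URI> uploadURIs, long fileSize, long partSize) {
        if (fileSize <= 0 || partSize <= 0) {
            throw new IllegalArgumentException("fileSize and partSize must be positive");
        }
        List<Part> parts = new ArrayList<>();
        Iterator<URI> uris = uploadURIs.iterator();
        for (long offset = 0; offset < fileSize; offset += partSize) {
            if (!uris.hasNext()) {
                throw new IllegalStateException("More parts than upload URIs");
            }
            // The last part carries whatever is left of the binary.
            long length = Math.min(partSize, fileSize - offset);
            parts.add(new Part(uris.next(), offset, length));
        }
        return parts;
    }
}
```

Each planned part would then be sent with an HTTP PUT to its own URI. With five URIs, a 250-byte binary, and a part size of 100, the plan holds three parts and leaves the fourth and fifth URIs unused.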
While the API supports multi-part uploading via multiple upload URIs, implementations are not required to support multi-part uploading. If the underlying implementation does not support multi-part uploading, a single URI will be returned regardless of the size of the data being uploaded.
Security considerations:
- The URIs cannot be shared with other users. They must only be returned to authenticated requests corresponding to this session user or trusted system components.
- The URIs must not be persisted for later use and will typically be time limited.
- The URIs will only grant access to this particular binary.
- The client cannot infer any semantics from the URI structure and path names. A URI would typically include a cryptographic signature. Any change to the URIs will likely result in a failing request.
- Returns:
- Iterable of URIs that can be used for uploading directly to a storage location.
-
getMinPartSize
long getMinPartSize()
Return the smallest possible part size in bytes. If a consumer wants to choose a custom part size, it cannot be smaller than this value. This does not apply to the final part. This value will be equal to or larger than zero.
Note that the API offers no guarantees that using this minimal part size is possible with the number of available getUploadURIs(). This might not be the case if the binary is too large. Please refer to the upload algorithm for the correct use of this value.
- Returns:
- The smallest part size acceptable for multi-part uploads.
-
getMaxPartSize
long getMaxPartSize()
Return the largest possible part size in bytes. If a consumer wants to choose a custom part size, it cannot be larger than this value. If this returns -1, the maximum is unlimited.
The API guarantees that a client can split the binary of the requested size using this maximum part size and there will be sufficient URIs available in getUploadURIs(). Please refer to the upload algorithm for the correct use of this value.
- Returns:
- The maximum part size acceptable for multi-part uploads or -1 if there is no limit.
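A client that picks a custom part size could check it against the limits from getMinPartSize() and getMaxPartSize(), including the -1 convention, before uploading. The sketch below follows the three conditions listed in the upload algorithm literally. PartSizeCheck and isAcceptable are hypothetical names, not part of the Jackrabbit API, and the check deliberately ignores the special case of a binary smaller than the minimum part size, which is uploaded as a single part.

```java
/** Hypothetical validation helper; not part of the Jackrabbit API. */
final class PartSizeCheck {

    /**
     * Returns true if a custom partSize satisfies all three conditions:
     * at least minPartSize, at most maxPartSize (unless it is -1), and
     * strictly larger than fileSize / numUploadURIs.
     */
    static boolean isAcceptable(long partSize, long fileSize, long minPartSize,
                                long maxPartSize, int numUploadURIs) {
        if (partSize <= 0 || fileSize <= 0 || numUploadURIs <= 0) {
            return false;
        }
        boolean atLeastMin = partSize >= minPartSize;
        // -1 means the storage provider imposes no upper limit.
        boolean atMostMax = maxPartSize == -1 || partSize <= maxPartSize;
        // Integer division, as written in the algorithm's third condition.
        boolean fitsInUris = partSize > fileSize / numUploadURIs;
        return atLeastMin && atMostMax && fitsInUris;
    }
}
```

For example, with a 250000000-byte binary, four URIs, a minimum of 10485760 and a maximum of 104857600, a part size of 104857600 passes the check, while 50000000 fails it because four such parts cannot cover the binary.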
-
getUploadToken
@NotNull java.lang.String getUploadToken()
Returns a token identifying this upload. This is required to finalize the upload at the end by calling JackrabbitValueFactory.completeBinaryUpload(String).
The format of this string is implementation-dependent. Implementations must ensure that clients cannot guess tokens for existing binaries.
- Returns:
- A unique token identifying this upload.