Skip to content

Commit

Permalink
Merge pull request #23 from smart-on-fhir/dtp/18-manual-retry
Browse files Browse the repository at this point in the history
Feature #18: Retry FileDownloads
  • Loading branch information
Dtphelan1 authored Jan 5, 2024
2 parents 70430b4 + fe57065 commit b2a14a7
Show file tree
Hide file tree
Showing 11 changed files with 273 additions and 34 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -108,3 +108,4 @@ config/*
!.gitkeep
!config/defaults.js
.DS_Store
.vscode
12 changes: 11 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,11 @@ Group-level export
node . --_since 2010-03 --_type Patient, Observations
```

Patient-level export with debugging information printed to the console
```sh
export NODE_DEBUG=app-request; node . -f https://builk-data.smarthealthit.org/fhir
```

For more options see the CLI parameters and configuration options below.

## Installation
Expand Down Expand Up @@ -133,7 +138,12 @@ The Bulk Data Client uses `js` configuration files, but you can think of them as
- *string* **`log.file`** - Path to the log file. Absolute, or relative to process CWD. If not provided, the file will be called log.ndjson and will be stored in the downloads folder.
- *object* **`log.metadata`** - Key/value pairs to be added to every log entry. Can be used to add useful information (for example which site imported this data).
- *number* **`retryAfterMSec`** - If the server does not provide `Retry-after` header use this number of milliseconds before checking the status again.

- *complex* **`logResponseHeaders`** - ResponseHeaders to include in error logs for debugging purposes.
- As for the complex type, valid values are `"all" | "none" | string | RegExp | (string | RegExp)[]`
- When `"all"` is specified, all responseHeaders are returned. When `"none"` is specified, no responseHeaders are returned. Otherwise, log any responseHeaders matches against 1...* strings/regexp
- *object* **`fileDownloadRetry`** - A subset of got retry configuration object, determining retry behavior when downloading files.
- For most scenarios, an object with only a `limit`: `number` property will be sufficient. This determines how many times a file download will be retried before failing. Each subsequent attempt will delay using an exponential backoff.
- For more details on options, see [https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md](https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md).

### Environment Variables
There are two environment that can be passed to the client to modify it's behavior.
Expand Down
9 changes: 7 additions & 2 deletions built/lib/BulkDataClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -509,7 +509,8 @@ class BulkDataClient extends events_1.EventEmitter {
let downloadStream = await download.run({
accessToken,
signal: this.abortController.signal,
requestOptions: this.options.requests
requestOptions: this.options.requests,
fileDownloadRetry: this.options.fileDownloadRetry,
})
.catch(e => {
if (e instanceof errors_1.FileDownloadError) {
Expand Down Expand Up @@ -561,7 +562,11 @@ class BulkDataClient extends events_1.EventEmitter {
itemType: "attachment",
resourceType: null
});
return this.request(options, "Attachment");
return this.request({
...options,
// Retry behavior should be the same as the fileDownloadRetry behavior
retry: this.options.fileDownloadRetry,
}, "Attachment");
},
onDownloadComplete: (url, byteSize) => {
this.emit("downloadComplete", {
Expand Down
69 changes: 63 additions & 6 deletions built/lib/FileDownload.js
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ class FileDownload extends events_1.default {
super();
this.url = url;
this.state = {
numTries: 0,
downloadedChunks: 0,
downloadedBytes: 0,
uncompressedBytes: 0,
Expand All @@ -26,16 +27,59 @@ class FileDownload extends events_1.default {
emit(eventName, ...args) {
return super.emit(eventName, this.getState(), ...args);
}
/**
* An exponential backoff delay function for file-download retries
* Based on got/dist/source/core/calculate-retry-delay.js
* This is only needed because GOT's support for stream retrying isn't working as needed for our version
* This may change with future versions of GOT
* @returns the number of milliseconds to wait before the next request; 0 if no retry is needed
*/
calculateRetryDelay({ attemptCount, retryOptions, response, retryAfter }) {
if (attemptCount > retryOptions.limit) {
return 0;
}
const hasMethod = retryOptions.methods.includes(response.request.options.method);
const hasStatusCode = retryOptions.statusCodes.includes(response.statusCode);
if (!hasMethod || !hasStatusCode) {
return 0;
}
if (retryAfter) {
if (retryOptions.maxRetryAfter === undefined || retryAfter > retryOptions.maxRetryAfter) {
return 0;
}
return retryAfter;
}
// 413 CONTENT TOO LARGE: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/413
// No use in retrying, the server can't handle a request this large
if (response.statusCode === 413) {
return 0;
}
const noise = Math.random() * 100;
return ((2 ** (attemptCount - 1)) * 1000) + noise;
}
/**
* The actual request will be made immediately but the returned stream will
* be in paused state. Pipe it to some destination and unpause to actually
* "consume" the downloaded data.
*/
run(options = {}) {
const { signal, accessToken, requestOptions = {} } = options;
const { signal, accessToken, requestOptions = {}, fileDownloadRetry } = options;
this.state.numTries += 1;
return new Promise((resolve, reject) => {
const options = {
// To merge requestOptions.retry and fileDownloadRetry, we handle the cases where options could be
// a number or an object
let retryOption = requestOptions.retry;
if (typeof (retryOption) === "number") {
// When the retry option is a number, always go with the fileDownloadRetry value
retryOption = fileDownloadRetry;
}
else {
// When retry is an object, we should merge the two, with fileDownloadReady taking precedence
retryOption = { ...retryOption, ...fileDownloadRetry };
}
const localOptions = {
...requestOptions,
retry: retryOption,
decompress: false,
responseType: "buffer",
throwHttpErrors: false,
Expand All @@ -45,11 +89,11 @@ class FileDownload extends events_1.default {
}
};
if (accessToken) {
options.headers.authorization = `Bearer ${accessToken}`;
localOptions.headers.authorization = `Bearer ${accessToken}`;
}
this.state.requestOptions = { ...options, url: this.url };
this.state.requestOptions = { ...localOptions, url: this.url };
this.emit("start");
const downloadRequest = request_1.default.stream(this.url, options);
const downloadRequest = request_1.default.stream(this.url, localOptions);
if (signal) {
const abort = () => {
debug("Aborting download request");
Expand All @@ -72,11 +116,24 @@ class FileDownload extends events_1.default {
this.state.downloadedBytes += data.length;
});
// Everything else happens after we get a response -----------------
downloadRequest.on("response", res => {
downloadRequest.on("response", (res) => {
const delay = this.calculateRetryDelay({
attemptCount: this.state.numTries,
retryOptions: res.request.options.retry,
response: res
});
if (delay > 0) {
return (0, utils_1.wait)(delay, signal).then(() => {
// Destroy this current request before making another one
downloadRequest.destroy();
return this.run(options);
}).then(resolve, reject);
}
// In case we get an error response ----------------------------
if (res.statusCode >= 400) {
return reject(new errors_1.FileDownloadError({
fileUrl: this.url,
// @ts-ignore
body: res.body,
responseHeaders: res.headers,
code: res.statusCode
Expand Down
13 changes: 12 additions & 1 deletion config/defaults.js
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
* The Bulk Data server token URL ("none" for open servers)
*/
tokenUrl: "none",

/**
* The private key (JWK) used to sign authentication tokens. This is not
* needed for open servers
Expand Down Expand Up @@ -300,4 +300,15 @@
* NOTE: When an empty array is specified, an empty object of responseHeaders will be returned
*/
logResponseHeaders: 'all',

/**
* A subset of got retry configuration object, determining retry behavior when downloading files.
* For most scenarios, an object with only a `limit`: `number` property will be sufficient.
* This determines how many times a file download will be retried before failing.
* Each subsequent attempt will delay using an exponential backoff.
* For more details on options, see [https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md](https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md).
*/
fileDownloadRetry: {
limit: 5,
},
}
32 changes: 30 additions & 2 deletions index.d.ts
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { OptionsOfUnknownResponseBody } from "got/dist/source";
import { Method, OptionsOfUnknownResponseBody } from "got/dist/source";
import { Algorithm } from "jsonwebtoken"
import jose from "node-jose"

Expand Down Expand Up @@ -225,7 +225,21 @@ export declare namespace BulkDataClient {
* Otherwise, log any responseHeaders matches against 1...* strings/regexp
* NOTE: When an empty array is specified, an empty object of responseHeaders will be returned
*/
logResponseHeaders: "all" | "none" | string | RegExp | (string | RegExp)[]
logResponseHeaders: "all" | "none" | string | RegExp | (string | RegExp)[],

/**
* A subset of got retry configuration object, determining retry behavior when downloading files.
* For most scenarios, an object with only a `limit`: `number` property will be sufficient.
* This determines how many times a file download will be retried before failing.
* Each subsequent attempt will delay using an exponential backoff.
* For more details on options, see [https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md](https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md).
*/
fileDownloadRetry: {
limit?: number; // The number of times to retry
methods?: Method[]; // The HTTP methods on which we should retry
statusCodes?: number[]; // The status codes to retry on
maxRetryAfter?: number; // The maximum amount of time we should wait to retry
},
}

interface LoggingOptions
Expand Down Expand Up @@ -493,6 +507,20 @@ export declare namespace BulkDataClient {
* NOTE: When an empty array is specified, an empty object of responseHeaders will be returned
*/
logResponseHeaders: "all" | "none" | string | RegExp | (string | RegExp)[]

/**
* A subset of got retry configuration object, determining retry behavior when downloading files.
* For most scenarios, an object with only a `limit`: `number` property will be sufficient.
* This determines how many times a file download will be retried before failing.
* Each subsequent attempt will delay using an exponential backoff.
* For more details on options, see [https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md](https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md).
*/
fileDownloadRetry: {
limit?: number; // The number of times to retry
methods?: Method[]; // The HTTP methods on which we should retry
statusCodes?: number[]; // The status codes to retry on
maxRetryAfter?: number; // The maximum amount of time we should wait to retry
},
}

interface JWK {
Expand Down
4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 8 additions & 3 deletions src/lib/BulkDataClient.ts
Original file line number Diff line number Diff line change
Expand Up @@ -718,8 +718,9 @@ class BulkDataClient extends EventEmitter
// Start the download (the stream will be paused though)
let downloadStream: Readable = await download.run({
accessToken,
signal: this.abortController.signal,
requestOptions: this.options.requests
signal : this.abortController.signal,
requestOptions : this.options.requests,
fileDownloadRetry : this.options.fileDownloadRetry,
})
.catch(e => {
if (e instanceof FileDownloadError) {
Expand Down Expand Up @@ -769,7 +770,11 @@ class BulkDataClient extends EventEmitter
itemType : "attachment",
resourceType: null
})
return this.request(options, "Attachment")
return this.request({
...options,
// Retry behavior should be the same as the fileDownloadRetry behavior
retry: this.options.fileDownloadRetry,
}, "Attachment")
},
onDownloadComplete: (url, byteSize) => {
this.emit("downloadComplete", {
Expand Down
Loading

0 comments on commit b2a14a7

Please sign in to comment.