
For a directory bulk data extraction that requests an entire copy of all content in the directory, the scope can be defined at the top level by simply specifying the resource types to retrieve from the base of the FHIR server.

GET [base]/$export?_type=Organization,Location,Practitioner,PractitionerRole,HealthcareService,VerificationResult, ...

A healthcare directory may curate such an extract via a nightly process and return it without needing to scan the live system. The transactionTime value in the result should contain the timestamp at which the extract was generated (including timezone information); that value should be used in a subsequent call to retrieve changes since this point in time.

Once a system has a complete set of data, it is usually more efficient to request only the changes since a point in time. Such a request should pass the transactionTime value above as the _since parameter to update the local directory.

GET [base]/$export?_type=Organization,Location,Practitioner, ... &_since=[transactionTime]

This behaves just the same as the initial request, with the exception of the content.
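As a minimal sketch of how a client might construct the incremental request, assuming it has stored the transactionTime from the previous extract (the function and parameter names here are illustrative, not part of the specification):

```python
from urllib.parse import urlencode

def incremental_export_url(base, resource_types, transaction_time):
    """Build the $export URL for changes since a previous extract.

    `transaction_time` is the transactionTime value returned by the
    earlier export and stored by the client.
    """
    params = urlencode({
        "_type": ",".join(resource_types),
        "_since": transaction_time,
    })
    return f"{base}/$export?{params}"
```

Note that urlencode percent-encodes the commas and colons; servers are required to accept either form.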


Note: The current bulk data handling specification does not handle deleted items. The recommendation is that a complete download be performed periodically to check for "gaps" and reconcile deletions (which could also be due to security changes). However, content should not usually be deleted; it should be marked as inactive or end dated.

Proposal: When the _since parameter is used, include one or more deletions bundles for each resource type to report all the deletions. These would appear in a new property, "deletions", in the process output, as demonstrated in the example status tracking output section below. Each bundle would have a type of "collection", and each entry would be formatted as a deleted item in a history bundle:


    <entry>
      <request>
        <!-- no resource included for a delete -->
        <method value="DELETE"/>
        <url value="[type]/[id]"/>
      </request>
      <response>
        <!-- response carries the instant the server processed the delete -->
        <lastModified value="2014-08-20T11:05:34.174Z"/>
      </response>
    </entry>

The total in each bundle is just the count of deletions in that bundle; the count in the operation result indicates the number of deletion bundles in the ndjson file (consistent with the other output types).
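Under this proposal, a client could apply a downloaded deletions file as sketched below, assuming the JSON form of the bundle structure described above (the function name is illustrative):

```python
import json

def deleted_references(ndjson_text):
    """Yield the 'ResourceType/id' references deleted on the server.

    Each line of the (proposed) deletions file is a Bundle of type
    'collection'; each entry mirrors a deleted item in a history bundle,
    carrying request.method = 'DELETE' and the resource's relative URL.
    """
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue
        bundle = json.loads(line)
        for entry in bundle.get("entry", []):
            request = entry.get("request", {})
            if request.get("method") == "DELETE":
                yield request.get("url")
```

The client would then mark the corresponding local records as removed (or inactive) rather than blindly purging them.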

List defined subsets

The previous sections cover all that is defined by the FHIR Bulk Data extract specification. However, we may choose to implement an additional parameter to this operation that also filters the selection to resources included in a specified List resource. The approach is similar to the _list search parameter defined by FHIR.

This could be used by client applications, such as a Primary Care System, that want to periodically update using this technique, but only for resources they currently hold in their "local directory" (an internal black book) cached from previous searches against the system.

GET [base]/$export?_type=Organization,Location,Practitioner,PractitionerRole,HealthcareService&_list=List/45

In this example the Primary Care System would be responsible for keeping List/45 up to date with the resources it is tracking. A national service may decide that permitting this style of List resource management is too much overhead; however, local enterprise directories may support this type of functionality.


Here I will only document the use of the global export, as an initial request.

The initial request:

GET [base]/$export?_type=Organization,Location,Practitioner,PractitionerRole,HealthcareService
with headers:
Accept: application/fhir+json
Authorization: Bearer [bearer token]
Prefer: respond-async

This will return either:

  • a status 4XX or 5XX with an OperationOutcome resource body if the request fails,
  • or a status 202 Accepted when successful, with a Content-Location header containing an absolute URI for subsequent status requests, and optionally an OperationOutcome in the resource body
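A client's handling of the kick-off response can be sketched as follows (the function name is illustrative; real code would also inspect any OperationOutcome body for error details):

```python
def kickoff_result(status_code, headers):
    """Interpret the response to the initial $export kick-off request.

    Returns the status-polling URL on success (202 Accepted); raises
    on a 4XX/5XX failure.
    """
    if status_code == 202:
        # Content-Location carries the absolute URI to poll for progress.
        return headers["Content-Location"]
    raise RuntimeError(f"export kick-off failed with HTTP {status_code}")
```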


After a bulk data request has been started, the client MAY poll the URI provided in the Content-Location header.

This will return:

  • HTTP status code of 202 Accepted while still in progress (with no body returned)
  • HTTP status code of 5XX when a fatal error occurs, with an OperationOutcome in JSON format as the body detailing the error
    (Note: this is a fatal error in processing, not an error encountered while processing files; a complete extract can contain errors)
  • HTTP status code of 200 OK when processing is complete, with a JSON object result as defined in the specification (example included below)


{
   "transactionTime": "[instant]",
   "request" : "[base]/$export?_type=Organization,Location,Practitioner,PractitionerRole,HealthcareService",
   "requiresAccessToken" : true,
   "output" : [{
     "type" : "Practitioner",
     "url" : "http://serverpath2/practitioner_file_1.ndjson",
     "count" : 10000
   },{
     "type" : "Practitioner",
     "url" : "http://serverpath2/practitioner_file_2.ndjson",
     "count" : 3017
   },{
     "type" : "Location",
     "url" : "http://serverpath2/location_file_1.ndjson",
     "count" : 4182
   }],

   // Note that this deletions property is a proposal, not part of the bulk data spec.
   "deletions": [{
     "type": "Bundle",
     "url": "http://serverpath2/deletions_file_1.ndjson", // each bundle includes the total number of deletions in the file
     "count": 23 // this is the number of bundles in the file, not the number of resources deleted
   }],

   "error" : [{
     "type" : "OperationOutcome",
     "url" : "http://serverpath2/err_file_1.ndjson",
     "count" : 439
   }]
}
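The polling and result-handling logic above can be sketched as follows, assuming the completion document has the shape shown (function names are illustrative):

```python
import json
from collections import defaultdict

def handle_poll(status_code, body_text):
    """Interpret one poll of the status-tracking URL.

    Returns None while the export is still running (202), the parsed
    result object when complete (200), and raises on a fatal error.
    """
    if status_code == 202:
        return None                   # still in progress; poll again later
    if status_code == 200:
        return json.loads(body_text)  # the completion document shown above
    raise RuntimeError(f"export failed with HTTP {status_code}")

def files_by_type(result):
    """Group the output file URLs by resource type, ready for download."""
    grouped = defaultdict(list)
    for item in result.get("output", []):
        grouped[item["type"]].append(item["url"])
    return dict(grouped)
```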

Retrieving the complete extract


While downloading, it is also recommended to include the header Accept-Encoding: gzip so the content is compressed in transit.

GET http://serverpath2/location_file_1.ndjson

(Note: our implementation will probably always gzip encode the content - as we are likely to store the processing files gzip encoded to save space in the storage system)
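Most HTTP client libraries decompress gzip transfer encoding transparently, but the raw handling of a downloaded ndjson file can be sketched as follows (the function name is illustrative):

```python
import gzip
import json

def read_ndjson(payload, content_encoding=None):
    """Decode a downloaded ndjson file into a list of resources,
    decompressing first when the response was gzip encoded."""
    if content_encoding == "gzip":
        payload = gzip.decompress(payload)
    return [
        json.loads(line)
        for line in payload.decode("utf-8").splitlines()
        if line.strip()
    ]
```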


This is the simplest part of the process: just call DELETE on the status tracking URL.

This tells the server that we are finished with the data and it can be deleted/cleaned up. The server may also impose time-based limits, keeping the extract only for a set period before automatically cleaning it up.