As this has matured through connectathon experience and essentially agreed upon, the formal HL7 artifacts are being produced, and can be found here:
And the community discussions and questions around this draft specification are here:
Note: The current bulk data handling specification does not handle deleted items, and the recommendation is that periodically a complete download should be done to check for "gaps" to reconcile the deletions (which could also be due to security changes), however content shouldn't usually be "deleted" it should be marked as inactive, or end dated.
Proposal: Include deletions bundle(s) for each resource type to report the deletions (when using the _since parameter) which would be in a new property "deletions" in the process output, as demonstrated in the status tracking output section below. This bundle would have a type of "collection", and each entry would be as per a deleted item in a history
<!-- no resource included for a delete -->
<!-- response carries the instant the server processed the delete -->
The total in the bundle will just be the count of deletions in the file, the total in the operation result will indicate the number of deletion bundles in the ndjson (same as the other types).
If the caller doesn't want to use the deletions, they can just ignore the files in the output, and not download those specific files.
List defined subsets
The previous sections are all that is defined by the FHIR Bulk Data extract specification, however we may choose to implement an additional parameter to this operation to permit the selection to also filter to resources that are included in a specified list resource. The approach is similar to the same capability defined by FHIR http://hl7.org/fhir/search.html#list
This functionality should be the subject of a connectathon to determine practical solutions.
One possibility could be to leverage the List functionality described above to maintain a state of what was included in a previous content, however this obviously incurs additional overhead on the part of the server, and for many systems - especially those at scale like a national system - this is likely not practical.
Format of the bulk data extract
This then tells the server that we are all finished with the data, and it can be deleted/cleaned up. The server may also include some time based limits where it may only keep it for a set period of time before it automatically cleans it up.
A close relative to the bulk data extract is the subscriptions content - and how these will work in the context of Bulk Directory exchanges needs further experimentation and connectathon experiences.
These could be setup to monitor the directory for realtime changes to resources of interest, and are defined through the use of search parameters