ehrQL output formats
Supported output formats🔗
The following output formats are supported:
 Recommended🔗
 Recommended🔗
- .arrow— Apache Arrow format
- .csv.gz— compressed CSV format
 Not recommended🔗
 Not recommended🔗
- .csv— uncompressed CSV format
 The uncompressed CSV format is not recommended,
because this produces much larger files than the alternative formats.
Unsupported output formats🔗
These formats were supported in cohort-extractor, but are not by ehrQL
- .dtaand- .dta.gz— Stata formats
arrowload for Stata users🔗
Stata itself does not directly support .arrow.
However, OpenSAFELY's Stata Docker image contains the arrowload library
that can load .arrow files in Stata.
Use arrowload as:
. arrowload /path/to/arrow/file
See the full documentation via running command-line Stata via OpenSAFELY:
opensafely exec stata-mp stata
and then running
. help arrowload
Selecting an output format🔗
You select an output format
when you use the --output option to specify an output filename for ehrQL.
The filename extension — for example, .arrow — that you provide determines the output format file.
If you specify a filename extension that is not supported, you will get an error telling you so.
 If you omit the 
--output option,
the output is not saved to a file.
Instead, the output is displayed at the command line.
Examples with opensafely exec🔗
.arrow🔗
opensafely exec ehrql:v0 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.arrow"
.csv.gz🔗
opensafely exec ehrql:v0 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.csv.gz"
Example project.yaml🔗
version: "3.0"
expectations:
  population_size: 1000
actions:
  extract_data:
    run: ehrql:v0 generate-dataset "./dataset_definition.py" --output "outputs/data_extract.arrow"
    outputs:
      highly_sensitive:
        population: outputs/data_extract.arrow
 The 
population filename must be identical to the output filename specified by --output.
Otherwise you will see the following error when you use opensafely run
to run the project actions:
$ opensafely run run_all
=> ProjectValidationError
   Invalid project:
   1 validation error for Pipeline
   __root__
     --output in run command and outputs must match (type=value_error)