Querying a Federation via the HeFQUIN Command-Line Program
HeFQUIN comes with several command-line programs, including one to run queries over a federation directly from the command line. This page focuses on that program.
You can use the program as follows, assuming the file MyFedConf.ttl contains the description of your federation and the file ExampleQuery.rq contains the query that you want to run over this federation, and assuming that you have added the HeFQUIN programs to your PATH (otherwise, enter the directory in which you have unpacked the HeFQUIN release and replace hefquin by bin/hefquin).
hefquin --federationDescription=MyFedConf.ttl --query=ExampleQuery.rq
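If it is unclear whether the HeFQUIN programs are on the PATH, a small wrapper can choose between the two ways of starting the program that are mentioned above. The following is only a minimal sketch: the function name resolve_hefquin and its optional directory argument are illustrative and not part of HeFQUIN.

```shell
# Print the command to use for starting the HeFQUIN program.
# $1 (optional): directory into which the HeFQUIN release was unpacked
#                (hypothetical parameter; defaults to the current directory).
resolve_hefquin() {
    release_dir="${1:-.}"
    if command -v hefquin >/dev/null 2>&1; then
        # The HeFQUIN programs are on the PATH.
        echo "hefquin"
    else
        # Fall back to the script in the bin directory of the release.
        echo "$release_dir/bin/hefquin"
    fi
}
```

With this helper, the command above could be started as "$(resolve_hefquin)" --federationDescription=MyFedConf.ttl --query=ExampleQuery.rq from within the release directory.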
Further arguments can be passed to the program, which are described in detail below. You can also have the list of supported arguments printed by executing the program with the argument --help.
hefquin --help
Query-Related Arguments of the Program
--federationDescription=<file> (mandatory argument)
Refers to a file that contains an RDF-based description of the federation to be queried. The file can be in any RDF serialization format (e.g., Turtle, N-Triples, JSON-LD) and it must use the HeFQUIN Federation Vocabulary. Every instance of type fd:FederationMember described by this document will be considered as a member of the federation to be queried.
--query=<file> (mandatory argument)
Refers to a file that contains the query to be executed over the federation.
--base=<uri>
This argument can be used to provide a base URI for the query.
--skipExecution
Use this argument if you want HeFQUIN to create the query execution plan for the given query, but without actually executing this plan. This argument would typically be used in combination with arguments for printing the plan in its various stages (--printSourceAssignment, --printLogicalPlan, and --printPhysicalPlan).
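As an illustration of combining --skipExecution with the plan-printing arguments, the following sketch wraps such a call in a shell function. The function name print_plans is our own choice, and the sketch assumes that hefquin is on the PATH.

```shell
# Print the plans of all three planning stages for a query
# without executing the query.
# $1: federation description file, $2: query file.
print_plans() {
    hefquin --federationDescription="$1" \
            --query="$2" \
            --skipExecution \
            --printSourceAssignment \
            --printLogicalPlan \
            --printPhysicalPlan
}
```

For the example files used above, the call would be: print_plans MyFedConf.ttl ExampleQuery.rq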
Result-Related Arguments of the Program
--results=<format>
Use this argument to specify the format in which you would like the query result to be printed. The possible values for this argument depend on the type of query. For SELECT queries, the possible values are the following (with TEXT being the default in case the argument is omitted).
- TEXT - Simple text output in table form (default).
- JSON - Standard JSON format.
- XML - Standard XML format.
- CSV - Standard CSV format.
- TSV - Standard TSV format.
- TTL - RDF representation, serialized using the RDF Turtle format.
- NTRIPLES - RDF representation, serialized using the N-Triples format.
- JSONLD - RDF representation, serialized using the JSON-LD format.
- RDF/XML - RDF representation, serialized using the RDF/XML format.
All of the RDF representations (TTL, NTRIPLES, JSONLD, and RDF/XML) use the result set vocabulary.
For CONSTRUCT queries, the possible values are the following (with TTL being the default).
- TTL - RDF Turtle format (default).
- NTRIPLES - N-Triples format.
- JSONLD - JSON-LD format.
- RDF/XML - RDF/XML format.
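To show how a result format is selected, here is a small sketch of a shell function that passes one of the format names listed above to the program. The function name run_query_as is hypothetical, and the sketch assumes that hefquin is on the PATH.

```shell
# Run a query and print its result in the given format.
# $1: federation description file
# $2: query file
# $3: result format, e.g., CSV for a SELECT query or NTRIPLES for a CONSTRUCT query
run_query_as() {
    hefquin --federationDescription="$1" \
            --query="$2" \
            --results="$3"
}
```

Assuming that the result is printed to standard output, it could then be stored in a file by redirecting the output: run_query_as MyFedConf.ttl ExampleQuery.rq CSV > result.csv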
--suppressResultPrintout
Use this argument if you want HeFQUIN to execute the query without printing the query result. This argument is typically used in combination with the arguments for measuring query processing times (--time, --printQueryProcMeasurements, and --printQueryProcStats), for which printing the query result is irrelevant and, in the case of the --time argument, may even affect the measurements.
Planning-Related Arguments of the Program
--printSourceAssignment
Use this argument to instruct the query planning component of HeFQUIN to print the initial logical plan that is the output of the source selection & query decomposition process; i.e., this is the logical plan that is passed as input to the logical query optimizer. If you are interested only in seeing the plan, without executing it, use this argument in combination with the --skipExecution argument.
--printLogicalPlan
Use this argument to instruct the query planning component of HeFQUIN to print the logical plan that is the output of the logical query optimizer; i.e., this is the logical plan that is passed as input to the physical query optimizer. If you are interested only in seeing the plan, without executing it, use this argument in combination with the --skipExecution argument.
--printPhysicalPlan
Use this argument to instruct the query planning component of HeFQUIN to print the physical plan that is the output of the physical query optimizer; i.e., this is the physical plan that is used to execute the query. If you are interested only in seeing the plan, without executing it, use this argument in combination with the --skipExecution argument.
Experiments-Related Arguments of the Program
--printQueryProcMeasurements
Use this argument to instruct the HeFQUIN engine to print one line of comma-separated measurement values for each query. These values are:
- the overall query processing time (in ms), which is the sum of the following three measurements;
- the query planning time (in ms), which includes source planning, logical optimization, and physical optimization;
- the compilation time (in ms), which is the time needed to translate the final physical plan into a so-called executable plan that is set up and ready for execution;
- the query execution time (in ms), which is the total time from starting the executable plan until completion of the executable plan.
Note that these measurements are purely related to processing that takes place within the HeFQUIN engine; any pre-processing and post-processing performed by the Jena query processing machinery that HeFQUIN uses is not considered for these measurements.
Since these measurements are printed as a single line in which the different values are separated by commas, this argument can be used to produce a CSV file of measurements for multiple queries (or for multiple executions of the same query). For this use case, you may want to use this argument in combination with the --suppressResultPrintout argument.
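The CSV use case could be scripted roughly as follows. This is a sketch under several assumptions: hefquin is on the PATH, the measurement line is written to standard output and is the only output once the result printout is suppressed, and the values appear in the order in which they are listed above. The function name collect_measurements and the column names in the header row are our own and are not produced by HeFQUIN.

```shell
# Write one line of measurements per query file into a CSV file.
# $1: federation description file
# $2: output CSV file (overwritten if it exists)
# remaining arguments: query files
collect_measurements() {
    fed_descr="$1"
    out_file="$2"
    shift 2
    # Header row; the column names are our own labels (assumed value order).
    echo "overall_ms,planning_ms,compilation_ms,execution_ms" > "$out_file"
    for query_file in "$@"; do
        hefquin --federationDescription="$fed_descr" \
                --query="$query_file" \
                --suppressResultPrintout \
                --printQueryProcMeasurements >> "$out_file"
    done
}
```

A call such as collect_measurements MyFedConf.ttl measurements.csv ExampleQuery.rq ExampleQuery.rq would then record two executions of the same query.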
--printQueryProcStats
Use this argument to instruct the HeFQUIN engine to print detailed statistics about the query planning and the query execution processes, as collected by the different components involved in these processes (including the operators that have been part of the query execution plan).
--printFedAccessStats
Use this argument to instruct the HeFQUIN engine to print detailed statistics collected by the federation access component of HeFQUIN.
--time
Use this argument to print the overall query processing time (in sec), which includes not only the processing within the HeFQUIN engine but also the pre-processing and post-processing performed by the Jena query processing machinery that HeFQUIN uses. To obtain a more detailed breakdown of the times required for the processing stages within the HeFQUIN engine, use the --printQueryProcMeasurements argument instead.
Note that the time reported when using this argument may be affected by the time required for printing the query result, whereas the times reported when using the --printQueryProcMeasurements argument are unaffected because query result printing is part of the post-processing. Since the printing time is irrelevant to the actual query processing, you may want to disable the printing when using this argument by adding the --suppressResultPrintout argument.
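A timing run that follows this advice might look like the sketch below, which repeats the same query a given number of times with the result printout disabled. The function name time_query_runs and the idea of repeating the run are illustrative additions, and hefquin is assumed to be on the PATH.

```shell
# Run the same query several times, printing the overall query
# processing time of each run but not the query result.
# $1: federation description file, $2: query file, $3: number of runs.
time_query_runs() {
    run=1
    while [ "$run" -le "$3" ]; do
        hefquin --federationDescription="$1" \
                --query="$2" \
                --time \
                --suppressResultPrintout
        run=$((run + 1))
    done
}
```

For example, time_query_runs MyFedConf.ttl ExampleQuery.rq 5 would execute the example query five times.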
Other Arguments of the Program
--confDescr=<file>
Refers to a file that contains an RDF-based description of the configuration to be used for the HeFQUIN engine and its components. The file can be in any RDF serialization format (e.g., Turtle, N-Triples, JSON-LD), it must use the HeFQUIN Configuration Vocabulary, and it must contain exactly one instance of type ec:HeFQUINEngineConfiguration.
This argument is relevant primarily during development and for conducting experiments with different internal components.
--help
If this argument is given, the program prints a list of all supported arguments and terminates (i.e., all other given arguments are ignored).