Class BaseForExecOpSequentialBindJoin<QueryType extends Query,MemberType extends FederationMember>

java.lang.Object
se.liu.ida.hefquin.engine.queryplan.executable.impl.ops.BaseForExecOps
se.liu.ida.hefquin.engine.queryplan.executable.impl.ops.UnaryExecutableOpBase
se.liu.ida.hefquin.engine.queryplan.executable.impl.ops.BaseForExecOpSequentialBindJoin<QueryType,MemberType>
All Implemented Interfaces:
StatsProvider, ExecutableOperator, UnaryExecutableOp
Direct Known Subclasses:
BaseForExecOpSequentialBindJoinSPARQL, ExecOpSequentialBindJoinBRTPF

public abstract class BaseForExecOpSequentialBindJoin<QueryType extends Query,MemberType extends FederationMember> extends UnaryExecutableOpBase
A generic implementation of a batch-based bind-join algorithm that performs the bind-join requests sequentially, one after another, using executable request operators. The implementation is generic in the sense that it works with any type of request operator. Each concrete implementation that extends this base class needs to implement the createExecutableReqOp(Set) method to create the request operators with the types of requests that are specific to that concrete implementation.
The algorithm collects solution mappings from the input. Once enough solution mappings have arrived, the algorithm creates the corresponding request (see above) and sends this request to the federation member (the algorithm may even decide to split the input batch into smaller batches for multiple requests; see below). The response to such a request is the subset of the solutions for the query/pattern of this operator that are join partners for at least one of the solutions that were used for creating the request. After receiving such a response, the algorithm locally joins the solutions from the response with the solutions in the batch used for creating the request, and outputs the resulting joined solutions (if any). Thereafter, the algorithm goes back to collecting solution mappings from the input until it can issue the next request, and so on.
This implementation is capable of separating out each input solution mapping that assigns a blank node to any of the join variables. Such solution mappings are not considered at all when creating the requests because they cannot have any join partners in the results obtained from the federation member. If the algorithm is used with outer-join semantics, these solution mappings are still passed to the output (without joining them with anything).
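The blank-node separation described above can be pictured with the following minimal sketch. It is not the code of this class: solution mappings are modelled as plain Map<String, String> objects, a blank node is assumed to be any term whose string form starts with "_:", and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; none of these names are part of the HeFQUIN API.
final class BlankNodeSplitSketch {

    // Assumption: terms are strings and blank nodes are written as "_:label".
    static boolean bindsBlankNode(Map<String, String> solMap, Set<String> joinVars) {
        for (String v : joinVars) {
            String term = solMap.get(v);
            if (term != null && term.startsWith("_:")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Adds every input mapping without a blank node on a join variable to
     * 'forRequest'. Returns the mappings that bypass the request: under
     * outer-join semantics these are the blank-node mappings (unchanged),
     * under inner-join semantics the returned list is empty.
     */
    static List<Map<String, String>> separate(List<Map<String, String>> input,
                                              Set<String> joinVars,
                                              boolean useOuterJoinSemantics,
                                              List<Map<String, String>> forRequest) {
        List<Map<String, String>> directOutput = new ArrayList<>();
        for (Map<String, String> solMap : input) {
            if (bindsBlankNode(solMap, joinVars)) {
                if (useOuterJoinSemantics) {
                    directOutput.add(solMap); // kept, but never joined
                }
                // under inner-join semantics the mapping is simply dropped
            } else {
                forRequest.add(solMap);
            }
        }
        return directOutput;
    }
}
```

In this sketch only the mappings placed in forRequest would contribute to a bind-join request.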
A feature of this implementation is that, if a request operator fails, the implementation automatically reduces the batch size for requests and then tries to re-process (with the reduced request batch size) the input solution mappings for which the request operator failed.
Another feature of this implementation is that it can switch into a full-retrieval mode as soon as there is an input solution mapping that does not have a binding for any of the join variables (which may happen only in cases in which at least one of the join variables is not a certain variable). Such an input solution mapping is compatible with (and, thus, can be joined with) every solution mapping that the federation member has for the query/pattern of this bind-join operator. Therefore, when switching into full-retrieval mode, this implementation performs a request to retrieve the complete set of all these solution mappings and then uses this set to find join partners for the current and the future batches of input solution mappings (because, with the complete set available locally, there is no need to issue further bind-join requests). This capability relies on the createExecutableReqOpForAll() method that needs to be provided by each concrete implementation that extends this base class.
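The retry behaviour on failure can be sketched as follows. This is an illustration under stated assumptions, not the code of this class: the request is modelled as a java.util.function.Function that may throw, the batch size is halved on each failure (the actual reduction strategy is not specified here), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch only; none of these names are part of the HeFQUIN API.
final class BatchRetrySketch {

    /**
     * Processes 'input' in consecutive batches of at most 'batchSize' elements.
     * If 'request' throws for a batch, the batch size is halved (assumption),
     * but never below 'minBatchSize', and the failed elements are tried again.
     */
    static <I, O> List<O> processSequentially(List<I> input,
                                              int batchSize,
                                              int minBatchSize,
                                              Function<List<I>, List<O>> request) {
        List<O> output = new ArrayList<>();
        int pos = 0;
        while (pos < input.size()) {
            int end = Math.min(pos + batchSize, input.size());
            List<I> batch = input.subList(pos, end);
            try {
                output.addAll(request.apply(batch));
                pos = end; // batch succeeded; continue with the next one
            } catch (RuntimeException e) {
                if (batchSize <= minBatchSize) {
                    throw e; // the batch size cannot be reduced any further
                }
                batchSize = Math.max(minBatchSize, batchSize / 2);
                // 'pos' is unchanged, so the failed elements are re-processed
            }
        }
        return output;
    }
}
```

The sketch gives up once a request fails at the minimum batch size; whether the real operator does the same is not stated in this documentation.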
  • Field Details

    • DEFAULT_BATCH_SIZE

      public static final int DEFAULT_BATCH_SIZE
    • query

      protected final QueryType query
    • fm

      protected final MemberType fm
    • varsInQuery

      protected final Set<org.apache.jena.sparql.core.Var> varsInQuery
    • useOuterJoinSemantics

      protected final boolean useOuterJoinSemantics
    • allJoinVarsAreCertain

      protected final boolean allJoinVarsAreCertain
    • requestBlockSize

      protected int requestBlockSize
      The number of solution mappings that this operator uses for each of the bind-join requests. This number may be adapted at runtime.
    • minimumRequestBlockSize

      protected static final int minimumRequestBlockSize
      The minimum value to which requestBlockSize can be reduced.
    • currentBatch

      protected final List<SolutionMapping> currentBatch
      This list is used to collect the input solution mappings (obtained from the child operator in the execution plan) for which the next bind-join request will ask for possible join partners. Note that these are not necessarily the solution mappings to be used for forming the next bind-join request; those are collected in parallel in currentSolMapsForRequest. Once the response received for the next bind-join request has been handled, this list will be cleared (and then populated again, by using the next input solution mappings that will arrive afterwards).
    • currentSolMapsForRequest

      protected final Set<org.apache.jena.sparql.engine.binding.Binding> currentSolMapsForRequest
      This set is used to collect the solution mappings that will be used to form the next bind-join request. These solution mappings are created by restricting relevant input solution mappings (obtained from the child operator in the execution plan) to the join variables; i.e., by projecting away the non-join variables, as the bindings for these do not need to be shipped in the bind-join requests. The input solution mappings from which the solution mappings in this set have been created are collected in parallel in currentBatch. Multiple input solution mappings may result in the same restricted solution mapping. Once the response received for the next bind-join request has been handled, this set will be cleared (and then populated again, by using the next input solution mappings that will arrive afterwards).
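The restriction to the join variables, and the reason a set is used, can be illustrated with a small sketch. It is an assumption-laden stand-in: solution mappings are plain Map<String, String> objects rather than Jena Binding objects, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; none of these names are part of the HeFQUIN API.
final class RestrictSketch {

    // Keeps only the bindings for the join variables.
    static Map<String, String> restrict(Map<String, String> solMap, Set<String> joinVars) {
        Map<String, String> restricted = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : solMap.entrySet()) {
            if (joinVars.contains(e.getKey())) {
                restricted.put(e.getKey(), e.getValue());
            }
        }
        return restricted;
    }

    // Restricts every mapping of the batch; the set removes duplicates, so
    // the result may be smaller than the batch it was created from.
    static Set<Map<String, String>> restrictAll(List<Map<String, String>> batch,
                                                Set<String> joinVars) {
        Set<Map<String, String>> result = new LinkedHashSet<>();
        for (Map<String, String> solMap : batch) {
            result.add(restrict(solMap, joinVars));
        }
        return result;
    }
}
```

For a batch holding {x=a, y=1} and {x=a, y=2} with join variable x, both mappings restrict to {x=a}, so only one restricted mapping would need to be shipped.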
    • fullResult

      protected Iterable<SolutionMapping> fullResult
      If this operator has switched to full-retrieval mode, this field contains all solution mappings retrieved for the query of this operator.
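How a locally available full result can serve as the source of join partners is sketched below. This is a naive nested-loop join with inner-join semantics, written under assumptions: solution mappings are plain Map<String, String> objects, two mappings count as compatible when they agree on every variable they share, and all names are hypothetical rather than taken from this class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; none of these names are part of the HeFQUIN API.
final class LocalJoinSketch {

    // Two mappings are compatible if they agree on every shared variable.
    // A mapping that shares no variable with the other is always compatible.
    static boolean compatible(Map<String, String> a, Map<String, String> b) {
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Joins each input mapping with every compatible mapping of the full result.
    static List<Map<String, String>> join(List<Map<String, String>> batch,
                                          Iterable<Map<String, String>> fullResult) {
        List<Map<String, String>> output = new ArrayList<>();
        for (Map<String, String> input : batch) {
            for (Map<String, String> retrieved : fullResult) {
                if (compatible(input, retrieved)) {
                    Map<String, String> merged = new HashMap<>(input);
                    merged.putAll(retrieved);
                    output.add(merged);
                }
            }
        }
        return output;
    }
}
```

An input mapping without a binding for any join variable, such as {y=m} against results over x and z, is joined with every retrieved mapping, which is the situation that motivates full-retrieval mode.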
    • requestBlockSizeWasReduced

      protected boolean requestBlockSizeWasReduced
    • numberOfRequestOpsUsed

      protected int numberOfRequestOpsUsed
    • numOfSolMapsRetrievedPerReqOp

      protected List<Long> numOfSolMapsRetrievedPerReqOp
    • statsOfFirstReqOp

      protected ExecutableOperatorStats statsOfFirstReqOp
    • statsOfLastReqOp

      protected ExecutableOperatorStats statsOfLastReqOp
  • Constructor Details

    • BaseForExecOpSequentialBindJoin

      public BaseForExecOpSequentialBindJoin(QueryType query, Set<org.apache.jena.sparql.core.Var> varsInQuery, MemberType fm, ExpectedVariables inputVars, boolean useOuterJoinSemantics, int batchSize, boolean collectExceptions, QueryPlanningInfo qpInfo)
      Parameters:
      query - the graph pattern (or other kind of query) to be evaluated (in a bind-join manner) at the federation member given as 'fm'
      varsInQuery - the variables that occur in the 'query'
      fm - the federation member targeted by this operator
      inputVars - the variables to be expected in the solution mappings that will be pushed as input to this operator
      useOuterJoinSemantics - true if the 'query' is to be evaluated under outer-join semantics; false for inner-join semantics
      batchSize - the number of solution mappings to be included in each bind-join request; this value must not be smaller than minimumRequestBlockSize
      collectExceptions - true if this operator has to collect exceptions (which is handled entirely by one of the super classes); false if the operator should immediately throw every ExecOpExecutionException
  • Method Details