Advanced Filtering in REST API

In this article we will look at how to implement advanced filtering in REST API when a GET request a list of results. In our API project that will be GET /v1/travels and GET /v1/tours endpoints. Although setup for filtering is similar to sorting, the implementation is a bit more complex.

First we are going to use filter_by parameter that will look something like ?filter_by=name:"Cool Tour" AND price:[10 TO 30] This is Liqe Query Language (LQL), which is heavily inspired by Lucene.

Next we need to convert Liqe expression into an abstract syntax tree (AST). Liqe npm package will do that for us.

Finally we create a Sequelize Where clause object from the AST. Although our implementation of AST to Where clause object translator won’t be complete, it will give you a good starting point so you can customize it and add to it to fit your project.

Safety and Performance Concerns

While we implement filtering, we have to keep in mind the following safety and performance concerns:

  1. Since we are using user input, there’s a chance of SQL injection. Because we are converting the AST into Sequelize Where clause object, Sequelize will sanitize user input for us. If you are planning to convert AST to plain SQL, please use SQL builder that will allow you compose SQL fragments safely by using tagged template literals.
  2. Our search interface can be used as a vector for DDoS attacks. Someone, intentionally or accidentally, can put something like this name:A OR name:B OR name:C or name:D... Such query will be computationally expensive. Since we have AST of the query, we can calculate complexity of the query and set an upper limit. In our case we will implement it by calculating how many levels deep AST object is and set an upper limit of 5.
  3. Someone can run queries against the database fields that are not optimized for filtering, for example description field. This will be also computationally expensive. We will prevent that by using allowedFilterByFields array that will include only fields we want the user to run filtering against.

How to Implement Filtering in REST API

Now, with the theory out of the way, let’s get started. If you prefer video tutorials, please check out our YouTube video on how to implement advanced filtering in REST API.

First let’s install Liqe library with the command:

yarn add liqe

As usual we are going to update update OpenApi document to include filter_by query parameter in GET /v1/travels and GET /v1/tours routes.

- in: query
          name: filter_by
          schema:
            type: string
          required: false
          description: Filter by query parameter. F.e. *name:"My Cool Travel"*

Similar to query_by filter_by parameter is also in query, type of string and is not required.

Let’s add the same parameter to GET /v1/tours

- in: query
          name: filter_by
          schema:
            type: string
          required: false
          description: Filter by query parameter. F.e. *price:<30*

Let’s go ahead an update the code to handle filter_by query parameter.

We update listTravels method in travels controller to include filterBy option in options passed to repository.getAll() method. filterBy is assigned the value of req.query.filer_by

const travels = TravelResource.collection(
      await repository.getAll({
        sortBy: req.query.sort_by,
        filterBy: req.query.filter_by,
      })
    );

We update listTours method in tours controller to include the same option filterBy

const tours = TourResource.collection(
      await repository.getAll({
        sortBy: req.query.sort_by,
        filterBy: req.query.filter_by,
      })
    );

Now, let’s update BaseRepository class to handle filterBy property in the options parameter.

First, in BaseRepository.ts file add protected class property called allowedFilterByFields

protected allowedFilterByFields: Array<string> = [];

A default value of an empty array is assigned to the property. As we mentioned before, we can’t allow users to run filters on all the fields in the table of the resource. This may result in computationally expensive queries, high database load, and, as a result, slow API. We want to allow a user filter only by a limited number of fields that are usually indexed.

Let’s put the same property in child classes TravelRepository and TourRepository

We will allow filter travels by name and number of days.

protected allowedFilterByFields = ["name", "number_of_days"];

We will allow filter tours by name, price, starting date, ending date

protected allowedFilterByFields = [
    "name",
    "price",
    "starting_date",
    "ending_date",
  ];

Let’s go back to BaseRepository and update getAll method to handle filterBy option.

if (options.filterBy) {
      const filterBy = this.getFilterBy(options.filterBy);
      delete options.filterBy;
      options.where = filterBy;
    }

We will check if filter by exists on options passed to getAll method. If it does, we call getFilterBy method on BaseRepository and pass it filter by options. Then we will remove filterBy property on options object and add a where property and assign it the value returned from getFilterBy method. where property is used Sequelize to query data.

Now let’s create getFilterBy method in BaseRepository

protected getFilterBy(filterBy: string): Record<string, any> {
    try {
      return getSequelizeWhereClause(
        parse(filterBy),
        this.allowedFilterByFields
      );
    } catch (error: any) {
      throw new ApiError({
        name: "FILTER_BY_ERROR",
        message: error.message,
        status: 400,
        code: "ERR_FTB",
      });
    }
  }

getFilterBy receives filterBy string and returns an object. The method attempts to call getSequelizeWhereClause function. As a first argument, it passes filterBy string parsed by parse method from Liqe library. The second argument to getSequelizeWhereClause function is allowedFilterByFields If the call is successful, result of the function is returned. If there is an error, it is handled by the catch statement and gets re-thrown as an ApiError with name of FILTER_BY_ERROR and code ERR_FTB The ErrorHandler will catch it and return it back to the API user as an error response.

Finally, in utils folder, let’s create getSequelizeWhereClause function. This function we turn abstract syntax tree (AST) parsed by Liqe library into Sequelize where clause object.

import { Op } from "sequelize";
import { type LiqeQuery } from "liqe";
import { depthOfObject } from "./functions";

const maxSearchQueryDepth = 5;

const operatorsMap: Record<string, string> = {
  ":": "eq",
  ":=": "eq",
  ":>": "gt",
  ":>=": "gte",
  ":<": "lt",
  ":<=": "lte",
};

const getSequelizeWhereClause = (
  searchQuery: LiqeQuery,
  allowedFilterByFields: Array<string>
) => {
  if (depthOfObject(searchQuery) > maxSearchQueryDepth) {
    throw new Error("Filter by has too many expressions");
  }
  const translateTagExpression = (ast: LiqeQuery) => {
    if (ast.type !== "Tag") {
      throw new Error("Expected a tag expression.");
    }
    const {
      field,
      expression,
      operator,
    }: {
      field: Record<string, any>;
      expression: Record<string, any>;
      operator: Record<string, any>;
    } = ast;

    if (field.type === "ImplicitField") {
      throw new Error("Implicit fields are not supported");
    }

    if (!allowedFilterByFields.includes(field.name)) {
      throw new Error(`Field ${field.name} is not allowed in filter by`);
    }

    if (expression.type === "RangeExpression") {
      return {
        [field.name]: {
          [Op.between]: [expression.range.min, expression.range.max],
        },
      };
    }

    if (expression.type !== "LiteralExpression") {
      throw new Error("Expected a literal expression.");
    }

    if (!operatorsMap[operator.operator]) {
      throw new Error("Operator is not supported.");
    }

    return {
      [field.name]: {
        // @ts-ignore
        [Op[operatorsMap[operator.operator]]]: expression.value,
      },
    };
  };

 const translateExpression = (ast: LiqeQuery): Record<string, any> => {
    if (ast.type === "Tag") {
      return translateTagExpression(ast);
    }

    if (ast.type === "LogicalExpression") {
      if (!operatorsMap[ast.operator.operator]) {
        throw new Error("Logical operator is not supported.");
      }
      return {
        // @ts-ignore
        [Op[operatorsMap[ast.operator.operator]]]: [
          translateExpression(ast.left),
          translateExpression(ast.right),
        ],
      };
    }

    throw new Error("AST type is missing or not supported");
  };

  return translateExpression(searchQuery);
};

export default getSequelizeWhereClause;

First we import Op from Sequelize, type LiqeQuery from Liqe and depthOfObject function from utils functions. depthOfObject will tell us how many levels deep is the AST object. Limiting depth of AST object will prevent running too expensive queries against the database and protect from possibility of DDoS attacks.

export function depthOfObject(object: Record<string, any>) {
  let level = 1;
  for (const key in object) {
    if (!object.hasOwnProperty(key)) continue;

    if (typeof object[key] == "object") {
      const depth = depthOfObject(object[key]) + 1;
      level = Math.max(depth, level);
    }
  }
  return level;
}

Next we will define maxSearchQueryDepth to be 5. This will limit filter by query to 3 expressions connected by two ANDs or ORs, or combination of them. For example: price:<30 AND starting_date:>2023-11-01 AND ending_date:<2023-11-30

We will also define operatorsMap that will map AST operators to Sequelize Op operators.

Finally we will defined getSequelizeWhereClause function. It first makes sure AST query object doesn’t exceed the max search query depth. Then, it defines two functions translateTagExperssion and translateExpression . Finally it calls translateExpression function with searchQuery AST object as an argument and returns the result of that call.

Let’s look at translateExperssion function. It accepts AST object that is type of LiqeQuery as an argument. It first checks if the expression is a Tag expression. If so, it returns a call to translateTagExpressionFunction. Next it checks if ast.type is LogicalExpression. If it is, it first makes sure that logical operator exists in the operators. Map. Then the function returns Sequelize Where clause object with the mapped operator and calls itself on the right and left sides of the expression. If AST type is not a Tag or Logical expression and error is thrown. Other AST types you can implement are UnaryOperator (NOT expression) and ParenthesizedExpression. Please see an article by Gajus Kuizinas, the creator of Liqe, in the description of the video.

Now let’s look at translateTagEpression function. It has an AST object as a parameter. The function checks if AST type is Tag If not, an exception is thrown. It destructs field , expression and operator from the AST object. It checks if field type is ImplicitField. For our purposes we don’t want to support implicit fields. If it makes sense for your application to have an implicit field, you can assign whatever field you want from your database if the field is not specified in the filter by expression.

Next we check if field name is included in allowed filter by fields. For safety reasons mentioned before, we don’t want let users filter through any field they want. If the field name is not in the allowed filter by array, we through an exception. Next, we check if the expression type is RangeExpression If so, we return Sequelize Where clause object that runs between query on expression.range.min and expression.range.max After that, we are almost done with the logic. The only thing is left to check if AST expression is LiteralExpression and if operatorsMap object has operator.operator as its key.

If both condition are satisfied, the function returns Sequelize where clause object that runs a query on the specified field name with the operator and expression value.

We are finally done. As I mentioned before, getSequelizeWhereClause function doesn’t cover every use case, but should give you a good idea of how to parse an AST object so you can customize it to fit your project.

Conclusion

In conclusion, mastering advanced filtering in your REST API is a pivotal step towards creating a resilient and user-friendly web service. By delving into the nuanced strategies, best practices, and practical tips shared in this article, you’ve equipped yourself with the tools to optimize data retrieval and enhance the overall efficiency of your API.

Resources

Source Code

Liqe Package

Gajus Kuizinas’ Article on Parsing AST

Share this article

Posted

in

by