Summarize Table Data

https://{your-workspace-slug}.{region}.xata.sh/db/db_branch_name/tables/table_name/summarize

This endpoint summarizes from your database. It comes with a range of functions to help perform calculations on the data you have stored in your tables

Expected parameters

NameDescriptionInRequiredSchema
db_branch_name

The DBBranchName matches the pattern {db_name}:{branch_name}.

path✅string
table_name

The Table name

path✅string

Summarize Table

POST
https://{your-workspace-slug}.{region}.xata.sh/db/db_branch_name/tables/table_name/summarize

This endpoint allows you to (optionally) define groups, and then to run calculations on the values in each group. This is most helpful when you'd like to understand the data you have in your database.

A group is a combination of unique values. If you create a group for sold_by, product_name, we will return one row for every combination of sold_by and product_name you have in your database. When you want to calculate statistics, you define these groups and ask Xata to calculate data on each group.

Some questions you can ask of your data:

How many records do I have in this table?

  • Set columns: [] as we we want data from the entire table, so we ask for no groups.
  • Set summaries: {"total": {"count": "*"}} in order to see the count of all records. We use count: * here we'd like to know the total amount of rows; ignoring whether they are null or not.

What are the top total sales for each product in July 2022 and sold more than 10 units?

  • Set filter: {soldAt: { "$ge": "2022-07-01T00:00:00.000Z", "$lt": "2022-08-01T00:00:00.000Z"} } in order to limit the result set to sales recorded in July 2022.
  • Set columns: [product_name] as we'd like to run calculations on each unique product name in our table. Setting columns like this will produce one row per unique product name.
  • Set summaries: {"total_sales": {"count": "product_name"}} as we'd like to create a field called "total_sales" for each group. This field will count all rows in each group with non-null product names.
  • Set sort: [{"total_sales": "desc"}] in order to bring the rows with the highest total_sales field to the top.
  • Set summariesFilter: {"total_sales": {"$ge": 10}} to only send back data with greater than or equal to 10 units.

columns: tells Xata how to create each group. If you add product_id we will create a new group for every unique product_id.

summaries: tells Xata which calculations to run on each group. Xata currently supports count, min, max, sum, average.

sort: tells Xata in which order you'd like to see results. You may sort by fields specified in columns as well as the summary names defined in summaries.

note: Sorting on summarized values can be slower on very large tables; this will impact your rate limit significantly more than other queries. Try use filter to reduce the amount of data being processed in order to reduce impact on your limits.

summariesFilter: tells Xata how to filter the results of a summary. It has the same syntax as filter, however, by using summariesFilter you may also filter on the results of a query.

note: This is a much slower to use than filter. We recommend using filter wherever possible and summariesFilter when it's not possible to use filter.

page.size: tells Xata how many records to return. If unspecified, Xata will return the default size.

Request Body Type Definition

type SummarizeTable = {
    filter?: FilterExpression;
    columns?: ColumnsProjection;
    summaries?: SummaryExpressionList;
    sort?: SortExpression;
    summariesFilter?: FilterExpression;
    /**
     * The consistency level for this request.
     *
     * @default strong
     */
    consistency?: "strong" | "eventual";
    page?: {
        /**
         * The number of records returned by summarize. If the amount of data you have exceeds this, or you have
         * more complex reporting requirements, we recommend that you use the aggregate endpoint instead.
         *
         * @default 20
         * @maximum 1000
         * @minimum 1
         */
        size?: number;
    };
};
 
/**
 * @minProperties 1
 */
type FilterExpression = {
    $exists?: string;
    $existsNot?: string;
    $any?: FilterList;
    $all?: FilterList;
    $none?: FilterList;
    $not?: FilterList;
} & {
    [key: string]: FilterColumn;
};
 
/**
 * @example name
 * @example email
 * @example created_at
 */
type ColumnsProjection = string[];
 
/**
 * The description of the summaries you wish to receive. Set each key to be the field name
 * you'd like for the summary. These names must not collide with other columns you've
 * requested from `columns`; including implicit requests like `settings.*`.
 * 
 * The value for each key needs to be an object. This object should contain one key and one 
 * value only. In this object, the key should be set to the summary function you wish to use
 * and the value set to the column name to be summarized.
 * 
 * The column being summarized cannot be an internal column (id, xata.*), nor the base of
 * an object, i.e. if `settings` is an object with `dark_mode` as a field, you may summarize
 * `settings.dark_mode` but not `settings` nor `settings.*`.
 * 
 * @example {"all_users":{"count":"*"},"total_created":{"count":"created_at"},"min_cost":{"min":"cost"},"max_happiness":{"max":"happiness"},"total_revenue":{"sum":"revenue"},"average_speed":{"average":"speed"}}
 */
type SummaryExpressionList = {
    [key: string]: SummaryExpression;
};
 
type SortExpression = string[] | {
    [key: string]: SortOrder;
} | {
    [key: string]: SortOrder;
}[];
 
type FilterList = FilterExpression | FilterExpression[];
 
type FilterColumn = FilterColumnIncludes | FilterPredicate | FilterList;
 
/**
 * A summary expression is the description of a single summary operation. It consists of a single
 * key representing the operation, and a value representing the column to be operated on.
 * 
 * The column being summarized cannot be an internal column (id, xata.*), nor the base of
 * an object, i.e. if `settings` is an object with `dark_mode` as a field, you may summarize
 * `settings.dark_mode` but not `settings` nor `settings.*`.
 * 
 * We currently support several aggregation functions. Not all functions can be run on all column
 * types.
 * 
 *   - `count` is used to count the number of records in each group. Use `{"count": "*"}` to count
 *     all columns present, otherwise `{"count": "<column_path>"}` to count the number of non-null
 *     values are present at column path.
 * 
 *     Count can be used on any column type, and always returns an int.
 * 
 *   - `min` calculates the minimum value in each group. `min` is compatible with most types;
 *     string, multiple, text, email, int, float, and datetime. It returns a value of the same
 *     type as operated on. This means that `{"lowest_latency": {"min": "latency"}}` where
 *     `latency` is an int, will always return an int.
 * 
 *   - `max` calculates the maximum value in each group. `max` shares the same compatibility as
 *     `min`.
 * 
 *   - `sum` adds up all values in a group. `sum` can be run on `int` and `float` types, and will
 *     return a value of the same type as requested.
 * 
 *   - `average` averages all values in a group. `average` can be run on `int` and `float` types, and
 *     always returns a float.
 * 
 * @example {"count":"deleted_at"}
 */
type SummaryExpression = Record<string, any>;
 
type SortOrder = "asc" | "desc" | "random";
 
/**
 * @maxProperties 1
 * @minProperties 1
 */
type FilterColumnIncludes = {
    $includes?: FilterPredicate;
    $includesAny?: FilterPredicate;
    $includesAll?: FilterPredicate;
    $includesNone?: FilterPredicate;
};
 
type FilterPredicate = FilterValue | FilterPredicate[] | FilterPredicateOp | FilterPredicateRangeOp;
 
/**
 * Xata Table Record Metadata
 */
type Record = RecordMeta & {
    [key: string]: any;
};
 
type FilterValue = number | string | boolean;
 
/**
 * @maxProperties 1
 * @minProperties 1
 */
type FilterPredicateOp = {
    $any?: FilterPredicate[];
    $all?: FilterPredicate[];
    $none?: FilterPredicate | FilterPredicate[];
    $not?: FilterPredicate | FilterPredicate[];
    $is?: FilterValue | FilterValue[];
    $isNot?: FilterValue | FilterValue[];
    $lt?: FilterRangeValue;
    $le?: FilterRangeValue;
    $gt?: FilterRangeValue;
    $ge?: FilterRangeValue;
    $contains?: string;
    $startsWith?: string;
    $endsWith?: string;
    $pattern?: string;
};
 
/**
 * @maxProperties 2
 * @minProperties 2
 */
type FilterPredicateRangeOp = {
    $lt?: FilterRangeValue;
    $le?: FilterRangeValue;
    $gt?: FilterRangeValue;
    $ge?: FilterRangeValue;
};
 
/**
 * Xata Table Record Metadata
 */
type RecordMeta = {
    id: RecordID;
    xata: {
        /**
         * The record's version. Can be used for optimistic concurrency control.
         */
        version: number;
        /**
         * The time when the record was created.
         */
        createdAt?: string;
        /**
         * The time when the record was last updated.
         */
        updatedAt?: string;
        /**
         * The record's table name. APIs that return records from multiple tables will set this field accordingly.
         */
        table?: string;
        /**
         * Highlights of the record. This is used by the search APIs to indicate which fields and parts of the fields have matched the search.
         */
        highlight?: {
            [key: string]: string[] | {
                [key: string]: any;
            };
        };
        /**
         * The record's relevancy score. This is returned by the search APIs.
         */
        score?: number;
        /**
         * Encoding/Decoding errors
         */
        warnings?: string[];
    };
};
 
type FilterRangeValue = number | string;
 
/**
 * @maxLength 255
 * @minLength 1
 * @pattern [a-zA-Z0-9_-~:]+
 */
type RecordID = string;

Responses

type SummarizeTable = {
    summaries: Record<string, any>[];
};
 
/**
 * Xata Table Record Metadata
 */
type Record = RecordMeta & {
    [key: string]: any;
};
 
/**
 * Xata Table Record Metadata
 */
type RecordMeta = {
    id: RecordID;
    xata: {
        /**
         * The record's version. Can be used for optimistic concurrency control.
         */
        version: number;
        /**
         * The time when the record was created.
         */
        createdAt?: string;
        /**
         * The time when the record was last updated.
         */
        updatedAt?: string;
        /**
         * The record's table name. APIs that return records from multiple tables will set this field accordingly.
         */
        table?: string;
        /**
         * Highlights of the record. This is used by the search APIs to indicate which fields and parts of the fields have matched the search.
         */
        highlight?: {
            [key: string]: string[] | {
                [key: string]: any;
            };
        };
        /**
         * The record's relevancy score. This is returned by the search APIs.
         */
        score?: number;
        /**
         * Encoding/Decoding errors
         */
        warnings?: string[];
    };
};
 
/**
 * @maxLength 255
 * @minLength 1
 * @pattern [a-zA-Z0-9_-~:]+
 */
type RecordID = string;