Version: v2

Google Cloud Storage

Sync Expectations

The following table describes the sync behavior of Fullstory events to Google Cloud Storage in parquet files.

Sync Interval	Relevant Timestamps	Setup Guide
Approximately `processed_time` rounded to the next hour + 1 hour	`event_time` `ingested_time` `processed_time` `updated_time`	Help Doc

Duplicate events: We guarantee at least once delivery for each event. Events may be duplicated in your object storage (less than 1%).

Parquet File Path

The event parquet file path follows the hive partitioning layout (key=value, supported by BigQuery, etc.) and includes an ingested_time column that is formatted in UTC and represents the time that the events were ingested by Fullstory's servers, truncated to seconds. For example:

 fullstory_<org_id>/events/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet

Note that the ingested time does not imply the range of the event time in a parquet file. Use a larger range when referencing ingested_time to make sure the target event time range is included in a query.

Defined Object Parquet File Paths

Incremental changes to any defined objects are stored as new Parquet files in the destination.

The path follows the same hive partitioning layout as the events parquet file path and includes an ingested_time column formatted in UTC that indicates the time of the sync, partitioned per day. The following examples correspond to the defined object parquet files:

fullstory_<org_id>/configuration/element-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
fullstory_<org_id>/configuration/event-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
fullstory_<org_id>/configuration/page-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet

Note that the data in the defined object parquet files may contain duplicates. To ensure you are using the most up-to-date data, always reference the latest modified_time when retrieving defined object records.

Data Schema

Events Parquet File Schema

The following table describes the schema for the parquet files containing Fullstory named element definitions.

Field	Type	Nullable	Description
`event_id`	`string`	`N`	The identifier for the event.
`event_time`	`string`	`Y`	The time in UTC that the event occurred, formatted as `yyyy-MM-ddTHH:mm:ss.SSSSSSZ`.
`processed_time`	`string`	`N`	The time in UTC that the event was packaged in the parquet file, formatted as `yyyy-MM-ddTHH:mm:ss.SSSSSSZ`.
`updated_time`	`long`	`Y`	If set, the time when the parquet file was recreated represented by the number of milliseconds since the epoch.
`device_id`	`long`	`N`	The device ID as defined in the base event model.
`session_id`	`long`	`Y`	The session ID as defined in the base event model.
`view_id`	`long`	`Y`	The view ID as defined in the base event model.
`event_type`	`string`	`N`	The type of this event.
`event_properties`	`string`	`N`	A json string containing the associated event properties.
`source_type`	`string`	`N`	The source type of the event.
`source_properties`	`string`	`N`	A json string containing the associated source properties.

Element Definitions Parquet File Schema

The following table describes the schema for Fullstory named element definitions parquet files.

Property	Type	Description
`id`	`string`	The named element's ID. Use the ID to join on the `element_definition_id`.
`name`	`string`	The name of the named element.
`description`	`string`	The description of the named element.
`state`	`string`	The state of the named element (e.g., active, archived).
`created_time`	`string`	Timestamp of when the named element was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS
`created_by`	`string`	The Fullstory user that created the named element.
`modified_time`	`string`	Timestamp of when the named element was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS
`modified_by`	`string`	The Fullstory user that last modified the named element.

Event Definitions Parquet File Schema

The following table describes the schema for Fullstory defined event definitions parquet files.

Property	Type	Description
`id`	`string`	The defined event's ID. Use the ID to join on the `event_definition_id`.
`name`	`string`	The name of the defined event.
`description`	`string`	The description of the defined event.
`state`	`string`	The state of the defined event (e.g., active, archived).
`created_time`	`string`	Timestamp of when the defined event was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS
`created_by`	`string`	The Fullstory user that created the defined event.
`modified_time`	`string`	Timestamp of when the defined event was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS
`modified_by`	`string`	The Fullstory user that modified the defined event.

Page Definitions Parquet File Schema

The following table describes the schema for Fullstory page definitions parquet files.

Property	Type	Description
`id`	`string`	The page's ID. Use the ID to join on the `page_definition_id`.
`fs_link_id`	`string`	The string ID that can be used to construct a url in Fullstory app, e.g. `https://app.fullstory.com/ui/<org_id>/settings/pages/<link_id>`
`name`	`string`	The name of the page.
`description`	`string`	The description of the page.
`is_user_defined`	`boolean`	`True` if the page was defined by the user, `False` if defined by a Fullstory algorithm.
`state`	`string`	The state of the page (e.g., active, archived).
`created_time`	`string`	Timestamp of when the page was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS
`created_by`	`string`	The Fullstory user that created the page.
`modified_time`	`string`	Timestamp of when the page was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS
`modified_by`	`string`	The Fullstory user that modified the page.

Querying as an External Table

The parquet files containing events can be queried as an external table. For example, a BigQuery external table can be created by setting the path of files as fullstory_<org_id>/events/*. The ingested_time column should be auto detected and used in queries to limit the number of parquet files scanned.

When querying the JSON fields in BigQuery, you should favor parsing with PARSE_JSON in a subquery to avoid the overhead of running JSON_EXTRACT_VALUE which combines a parse and then access via path.

To demonstrate how to query the events as an external table, the following example shows how to count the number of Rage Click broken down by browser and URL for a single day.

SELECT
  JSON_VALUE(source_properties, '$.user_agent.browser') AS browser,
  JSON_VALUE(source_properties, '$.url.path') AS path,
  COUNT(1) AS rage_clicks
FROM (
  SELECT
    event_id,
    event_time,
    ingested_time,
    processed_time,
    updated_time,
    device_id,
    session_id,
    view_id,
    event_type,
    PARSE_JSON(event_properties, wide_number_mode=>'round') AS event_properties,
    source_type,
    PARSE_JSON(source_properties, wide_number_mode=>'round') AS source_properties
  FROM
    `project.dataset.external_table` ) parsed
WHERE
  ingested_time BETWEEN '2024-01-01 00:00:00' AND '2024-01-03 00:00:00'
  AND event_time >= '2024-01-01T00:00:00.000000Z' AND event_time <= '2024-01-01T99:99:99.999999Z'
  AND event_type = 'click'
  AND CAST(JSON_VALUE(event_properties, '$.fs_rage_count') AS INTEGER) > 0
GROUP BY
  browser,
  path
ORDER BY
  rage_clicks DESC;

Sync Expectations​

Parquet File Path​

Defined Object Parquet File Paths​

Data Schema​

Events Parquet File Schema​

Element Definitions Parquet File Schema​

Event Definitions Parquet File Schema​

Page Definitions Parquet File Schema​

Querying as an External Table​