Version: v2

Amazon S3

Sync Expectations

The following table describes the sync behavior of Fullstory events to Amazon S3 in parquet files.

| Sync Interval | Relevant Timestamps | Setup Guide |
| --- | --- | --- |
| Approximately processed_time rounded to the next hour + 1 hour | event_time, ingested_time, processed_time, updated_time | Help Doc |
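As a rough illustration of the sync interval, the sketch below estimates when an event should be available in S3 from its processed_time. The function name approximate_sync_time is hypothetical, and the handling of a processed_time that falls exactly on the hour (left unchanged before the extra hour is added) is an assumption; the table only states the interval approximately.

```python
from datetime import datetime, timedelta, timezone


def approximate_sync_time(processed_time: datetime) -> datetime:
    """Estimate when an event with this processed_time should appear in S3."""
    # Start of the hour that contains processed_time.
    hour_start = processed_time.replace(minute=0, second=0, microsecond=0)
    # Round up to the next hour boundary.
    # Assumption: a time already on the hour is not moved forward.
    if processed_time == hour_start:
        rounded = hour_start
    else:
        rounded = hour_start + timedelta(hours=1)
    # Add the additional hour described in the sync interval.
    return rounded + timedelta(hours=1)


processed = datetime(2024, 1, 1, 10, 15, 30, tzinfo=timezone.utc)
print(approximate_sync_time(processed).isoformat())  # 2024-01-01T12:00:00+00:00
```

Treat the result as an estimate for scheduling downstream jobs, not as a guarantee.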

Parquet File Path

The event parquet file path follows the hive partitioning layout (key=value, supported by BigQuery, etc.) and includes an ingested_time column that is formatted in UTC and represents the time that the events were ingested by Fullstory's servers, truncated to seconds. For example:

 fullstory_<org_id>/events/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet

Note that a file's ingested_time does not bound the event_time values it contains: events can be ingested well after they occur. When filtering on ingested_time, use a range wider than the target event_time range so that every event in the target range is included in the query.
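The path layout and the advice about wider ranges can be sketched in a few lines. The helper names and the one-day default padding are assumptions chosen for illustration; pick a padding that suits how late your events tend to arrive.

```python
from datetime import datetime, timedelta

# Format of the ingested_time partition value in the path.
PARTITION_FORMAT = "%Y-%m-%d %H:%M:%S"


def events_partition_prefix(org_id: str, ingested_time: datetime) -> str:
    """Build the key prefix of one events partition (ingested_time is UTC)."""
    value = ingested_time.strftime(PARTITION_FORMAT)
    return f"fullstory_{org_id}/events/ingested_time={value}/"


def ingested_time_window(event_start: datetime, event_end: datetime,
                         padding: timedelta = timedelta(days=1)) -> tuple:
    """Widen an event_time range into an ingested_time filter range.

    Assumption: padding both ends by one day is enough; this value is
    illustrative and not specified by Fullstory.
    """
    lower = (event_start - padding).strftime(PARTITION_FORMAT)
    upper = (event_end + padding).strftime(PARTITION_FORMAT)
    return lower, upper


print(events_partition_prefix("ORG123", datetime(2024, 1, 1, 1, 0, 0)))
print(ingested_time_window(datetime(2024, 1, 1), datetime(2024, 1, 2)))
```

The returned strings use the same format as the partition values, so they can be used directly in an ingested_time BETWEEN filter.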

Defined Object Parquet File Paths

Incremental changes to any defined objects are stored as new Parquet files in the destination.

The path follows the same hive partitioning layout as the events parquet file path and includes an ingested_time column formatted in UTC that indicates the time of the sync, partitioned per day. The following examples correspond to the defined object parquet files:

fullstory_<org_id>/configuration/element-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
fullstory_<org_id>/configuration/event-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
fullstory_<org_id>/configuration/page-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet

Note that the data in the defined object parquet files may contain duplicates. To ensure you are using the most up-to-date data, always reference the latest modified_time when retrieving defined object records.
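One way to apply this after loading the records is to keep, for each id, only the row with the greatest modified_time. The sketch below assumes the records are already available as dictionaries with id and modified_time keys; the function name and the sample values are hypothetical.

```python
def latest_definitions(records):
    """Return one record per id, keeping the most recent modified_time."""
    latest = {}
    for record in records:
        current = latest.get(record["id"])
        # modified_time uses the fixed-width 'YYYY-MM-DD HH:mm:ss.SSSSSS'
        # format, so plain string comparison orders values chronologically.
        if current is None or record["modified_time"] > current["modified_time"]:
            latest[record["id"]] = record
    return list(latest.values())


# Hypothetical sample rows: the same page definition synced twice.
rows = [
    {"id": "p1", "name": "Checkout", "modified_time": "2024-01-01 08:00:00.000000"},
    {"id": "p2", "name": "Home", "modified_time": "2024-01-01 09:30:00.000000"},
    {"id": "p1", "name": "Checkout v2", "modified_time": "2024-01-02 10:15:00.000000"},
]
for row in latest_definitions(rows):
    print(row["id"], row["name"])
```

The same logic applies to element, event, and page definitions, since all three schemas carry id and modified_time.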

Data Schema

Events Parquet File Schema

The following table describes the schema for the parquet files containing Fullstory events.

| Field | Type | Nullable | Description |
| --- | --- | --- | --- |
| event_id | string | N | The unique identifier for the event. |
| event_time | string | Y | The time in UTC that the event occurred, formatted as yyyy-MM-ddTHH:mm:ss.SSSSSSZ. |
| processed_time | string | N | The time in UTC that the event was packaged in the parquet file, formatted as yyyy-MM-ddTHH:mm:ss.SSSSSSZ. |
| updated_time | long | Y | If set, the time when the parquet file was recreated represented by the number of milliseconds since the epoch. |
| device_id | long | N | The device ID as defined in the base event model. |
| session_id | long | Y | The session ID as defined in the base event model. |
| view_id | long | Y | The view ID as defined in the base event model. |
| event_type | string | N | The type of this event. |
| event_properties | string | N | A json string containing the associated event properties. |
| source_type | string | N | The source type of the event. |
| source_properties | string | N | A json string containing the associated source properties. |
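Because the timestamps and property fields arrive as strings and integers, a reader usually has to decode them before use. The following is a minimal sketch of that step for a single row represented as a dictionary; the function name decode_event and the sample values are hypothetical, and only the field names and formats come from the schema above.

```python
import json
from datetime import datetime, timezone

# Matches yyyy-MM-ddTHH:mm:ss.SSSSSSZ as described in the schema.
EVENT_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_utc(value: str) -> datetime:
    """Parse a schema timestamp string into a timezone-aware UTC datetime."""
    return datetime.strptime(value, EVENT_TIME_FORMAT).replace(tzinfo=timezone.utc)


def decode_event(row: dict) -> dict:
    """Return a copy of the row with timestamps and JSON fields decoded."""
    decoded = dict(row)
    # event_time and updated_time are nullable, so guard against None.
    if row.get("event_time") is not None:
        decoded["event_time"] = parse_utc(row["event_time"])
    decoded["processed_time"] = parse_utc(row["processed_time"])
    if row.get("updated_time") is not None:
        # updated_time is milliseconds since the epoch.
        decoded["updated_time"] = datetime.fromtimestamp(
            row["updated_time"] / 1000, tz=timezone.utc
        )
    decoded["event_properties"] = json.loads(row["event_properties"])
    decoded["source_properties"] = json.loads(row["source_properties"])
    return decoded


# Hypothetical sample row for illustration only.
sample = {
    "event_id": "abc",
    "event_time": "2024-01-01T12:34:56.123456Z",
    "processed_time": "2024-01-01T13:00:00.000000Z",
    "updated_time": 1704112496000,
    "event_type": "click",
    "event_properties": '{"fs_rage_count": 2}',
    "source_properties": '{"url": {"path": "/checkout"}}',
}
event = decode_event(sample)
print(event["event_time"].isoformat(), event["event_properties"]["fs_rage_count"])
```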

Element Definitions Parquet File Schema

The following table describes the schema for Fullstory named element definitions parquet files.

| Property | Type | Description |
| --- | --- | --- |
| id | string | The named element's ID. Use the ID to join on the element_definition_id. |
| name | string | The name of the named element. |
| description | string | The description of the named element. |
| state | string | The state of the named element (e.g., active, archived). |
| created_time | string | Timestamp of when the named element was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
| created_by | string | The Fullstory user that created the named element. |
| modified_time | string | Timestamp of when the named element was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
| modified_by | string | The Fullstory user that last modified the named element. |

Event Definitions Parquet File Schema

The following table describes the schema for Fullstory defined event definitions parquet files.

| Property | Type | Description |
| --- | --- | --- |
| id | string | The defined event's ID. Use the ID to join on the event_definition_id. |
| name | string | The name of the defined event. |
| description | string | The description of the defined event. |
| state | string | The state of the defined event (e.g., active, archived). |
| created_time | string | Timestamp of when the defined event was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
| created_by | string | The Fullstory user that created the defined event. |
| modified_time | string | Timestamp of when the defined event was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
| modified_by | string | The Fullstory user that modified the defined event. |

Page Definitions Parquet File Schema

The following table describes the schema for Fullstory page definitions parquet files.

| Property | Type | Description |
| --- | --- | --- |
| id | string | The page's ID. Use the ID to join on the page_definition_id. |
| fs_link_id | string | The string ID that can be used to construct a url in Fullstory app, e.g. https://app.fullstory.com/ui/<org_id>/settings/pages/<link_id> |
| name | string | The name of the page. |
| description | string | The description of the page. |
| is_user_defined | boolean | True if the page was defined by the user, False if defined by a Fullstory algorithm. |
| state | string | The state of the page (e.g., active, archived). |
| created_time | string | Timestamp of when the page was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
| created_by | string | The Fullstory user that created the page. |
| modified_time | string | Timestamp of when the page was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
| modified_by | string | The Fullstory user that modified the page. |

Querying as an External Table

The parquet files containing events can be queried as an external table. For example, an Athena external table can be created using:

CREATE EXTERNAL TABLE IF NOT EXISTS table_name (
  event_id string,
  event_time string,
  processed_time string,
  updated_time bigint,
  device_id bigint,
  session_id bigint,
  view_id bigint,
  event_type string,
  event_properties string,
  source_type string,
  source_properties string
)
PARTITIONED BY (ingested_time string)
STORED AS PARQUET
LOCATION 's3://bucket/prefix/fullstory_<org_id>/events/'
TBLPROPERTIES ("parquet.compress"="ZSTD");

Before querying, add the partitions using (refer to the Athena documentation for more ways to add partitions):

ALTER TABLE table_name ADD PARTITION (ingested_time='2024-01-01 01:00:00')
LOCATION 's3://bucket/prefix/fullstory_<org_id>/events/ingested_time=2024-01-01 01:00:00/';
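When many partitions need to be registered, the statements can be generated rather than written by hand. The sketch below builds one statement per partition value; the function name, the table name, and the bucket location are placeholders, and the list of ingested_time values is assumed to come from listing the bucket. It adds IF NOT EXISTS so that re-running a statement for an already registered partition does not fail.

```python
def add_partition_statement(table_name: str, location_root: str, ingested_time: str) -> str:
    """Build an ALTER TABLE statement that registers one events partition.

    location_root is the table LOCATION, e.g. the .../events/ prefix.
    ingested_time is the partition value exactly as it appears in the path.
    """
    root = location_root.rstrip("/")
    return (
        f"ALTER TABLE {table_name} ADD IF NOT EXISTS "
        f"PARTITION (ingested_time='{ingested_time}')\n"
        f"LOCATION '{root}/ingested_time={ingested_time}/';"
    )


# Placeholder values for illustration only.
partitions = ["2024-01-01 01:00:00", "2024-01-01 02:00:00"]
for value in partitions:
    print(add_partition_statement(
        "table_name", "s3://bucket/prefix/fullstory_ORG123/events/", value
    ))
```

Each generated statement can then be submitted to Athena with whichever client you already use.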

The following example shows how to count the number of rage clicks broken down by browser and URL for a single day.

SELECT
  json_extract_scalar(source_properties, '$.user_agent.browser') AS browser,
  json_extract_scalar(source_properties, '$.url.path') AS path,
  COUNT(1) AS rage_clicks
FROM table_name
WHERE
  -- Scan a wider ingested_time range than the target day to catch late-ingested events.
  ingested_time BETWEEN '2024-01-01 00:00:00' AND '2024-01-03 00:00:00'
  -- event_time is a string; a half-open range covers the whole day without an invalid upper bound.
  AND event_time >= '2024-01-01T00:00:00.000000Z' AND event_time < '2024-01-02T00:00:00.000000Z'
  AND event_type = 'click'
  AND CAST(json_extract_scalar(event_properties, '$.fs_rage_count') AS INTEGER) > 0
GROUP BY
  1,
  2
ORDER BY
  rage_clicks DESC;