Azure Blob Storage
Sync Expectations
The following table describes the sync behavior of Fullstory events to Azure Blob Storage in parquet files.
Sync Interval | Relevant Timestamps | Setup Guide |
---|---|---|
Approximately processed_time rounded to the next hour + 1 hour | event_time ingested_time processed_time updated_time | Help Doc |
Parquet File Path
The event parquet file path follows the hive partitioning layout (key=value, supported by BigQuery,
etc.) and includes an ingested_time
column that is formatted in UTC and represents the time that the events were ingested by
Fullstory's servers, truncated to seconds. For example:
fullstory_<org_id>/events/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
Note that the ingested time does not imply the range of the event time in a parquet file. Use a larger range when referencing
ingested_time
to make sure the target event time range is included in a query.
Defined Object Parquet File Paths
Incremental changes to any defined objects are stored as new Parquet files in the destination.
The path follows the same hive partitioning layout as the events parquet file path
and includes an ingested_time
column formatted in UTC that indicates the time of the sync, partitioned per day.
The following examples correspond to the defined object parquet files:
fullstory_<org_id>/configuration/element-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
fullstory_<org_id>/configuration/event-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
fullstory_<org_id>/configuration/page-definitions/ingested_time=yyyy-MM-dd HH:mm:ss/<file_name>.parquet
Note that the data in the defined object parquet files may contain duplicates. To ensure you are using the most up-to-date data, always reference the latest
modified_time
when retrieving defined object records.
Data Schema
Events Parquet File Schema
The following table describes the schema for the parquet files containing Fullstory named element definitions.
Field | Type | Nullable | Description |
---|---|---|---|
event_id | string | N | The unique identifier for the event. |
event_time | string | Y | The time in UTC that the event occurred, formatted as yyyy-MM-ddTHH:mm:ss.SSSSSSZ . |
processed_time | string | N | The time in UTC that the event was packaged in the parquet file, formatted as yyyy-MM-ddTHH:mm:ss.SSSSSSZ . |
updated_time | long | Y | If set, the time when the parquet file was recreated represented by the number of milliseconds since the epoch. |
device_id | long | N | The device ID as defined in the base event model. |
session_id | long | Y | The session ID as defined in the base event model. |
view_id | long | Y | The view ID as defined in the base event model. |
event_type | string | N | The type of this event. |
event_properties | string | N | A json string containing the associated event properties. |
source_type | string | N | The source type of the event. |
source_properties | string | N | A json string containing the associated source properties. |
Element Definitions Parquet File Schema
The following table describes the schema for Fullstory named element definitions parquet files.
Property | Type | Description |
---|---|---|
id | string | The named element's ID. Use the ID to join on the element_definition_id . |
name | string | The name of the named element. |
description | string | The description of the named element. |
state | string | The state of the named element (e.g., active, archived). |
created_time | string | Timestamp of when the named element was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
created_by | string | The Fullstory user that created the named element. |
modified_time | string | Timestamp of when the named element was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
modified_by | string | The Fullstory user that last modified the named element. |
Event Definitions Parquet File Schema
The following table describes the schema for Fullstory defined event definitions parquet files.
Property | Type | Description |
---|---|---|
id | string | The defined event's ID. Use the ID to join on the event_definition_id . |
name | string | The name of the defined event. |
description | string | The description of the defined event. |
state | string | The state of the defined event (e.g., active, archived). |
created_time | string | Timestamp of when the defined event was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
created_by | string | The Fullstory user that created the defined event. |
modified_time | string | Timestamp of when the defined event was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
modified_by | string | The Fullstory user that modified the defined event. |
Page Definitions Parquet File Schema
The following table describes the schema for Fullstory page definitions parquet files.
Property | Type | Description |
---|---|---|
id | string | The page's ID. Use the ID to join on the page_definition_id . |
fs_link_id | string | The string ID that can be used to construct a url in Fullstory app, e.g. https://app.fullstory.com/ui/<org_id>/settings/pages/<link_id> |
name | string | The name of the page. |
description | string | The description of the page. |
is_user_defined | boolean | True if the page was defined by the user, False if defined by a Fullstory algorithm. |
state | string | The state of the page (e.g., active, archived). |
created_time | string | Timestamp of when the page was created. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
created_by | string | The Fullstory user that created the page. |
modified_time | string | Timestamp of when the page was last modified. Format: YYYY-MM-DD HH:mm:ss.SSSSSS |
modified_by | string | The Fullstory user that modified the page. |
Transforming Events data for SQL Queries
Note: Azure provides several methods for transforming data stored in Azure Blob Storage. The following example demonstrates how to transform the data using Azure Synapse Analytics.
Azure Synapse Analytics
Azure Synapse Analytics is a cloud-based data integration service that allows you to transform data stored in Azure Blob Storage by setting up a source (Azure Blob Storage files) and a sink (a SQL table in this case).
- Create a new Synapse workspace in the Azure portal.
- Create a new Data Flow in the Synapse workspace.
- Set a source integration dataset pointing to the Azure Blob Storage files.
- Set a sink integration dataset pointing to a desired SQL table.
- Set up a pipeline to run the Data Flow and transform the data to be used in your sink table.
Querying the Events Table
To demonstrate how to query the events table, the following example shows how to count the number of rage clicks broken down by browser and URL. For demonstration purposes, we include Microsoft SQL Server syntax, although this should be adapted for the specific sink you are using.
SELECT
JSON_VALUE(source_properties, '$.user_agent.browser') AS browser,
JSON_VALUE(source_properties, '$.url.path') AS path,
COUNT(1) AS rage_clicks
FROM
<tablename> -- Update the table reference based on what was provided as sink table during transformation.
WHERE
event_type = 'click'
AND CAST(JSON_VALUE(event_properties, '$.fs_rage_count') AS INT) > 0
GROUP BY
JSON_VALUE(source_properties, '$.user_agent.browser'),
JSON_VALUE(source_properties, '$.url.path')
ORDER BY
rage_clicks DESC;