Data Export - GCP

Ayla Networks provides two methods for extracting device event data from the Ayla Cloud: the Ayla Data Export feature and the DataStream Service (DSS). The Data Export feature retrieves historic data, giving developers and data analysts flexibility in how they retrieve and consume historic device event data. DSS enables customers to use cloud-scale event ingestion services (AWS Kinesis and Azure Event Hub) to retrieve near real-time event data from the Ayla platform.

Why Use Ayla's Data Export Feature

This feature can assist in the analysis and debugging of various use cases:

  • Troubleshooting inconsistent device behavior over a period of time
  • Identifying usage patterns and scope for improvement
  • Maintaining an archive of device event data for product lifecycle management

This feature also enables customers to access data beyond the current standard data retention policy on the Ayla Customer Dashboard, which is 90 days from the date of capture. The Data Export feature is not intended for accessing real-time data.

Data Export Event Types and Storage

The following device event types are stored on Google Cloud Storage (GCS):

  • Datapoint
  • Datapoint Ack
  • Location
  • Connection
  • Registration

GCS organizes data files into folders for each event type. As new event data is generated, the event data files are posted to GCS in their respective event type folders. High data volumes may result in multiple time-stamped subfolders under the specific event type folders, each containing a single CSV event data file.

AWS to GCP Migration Key Changes

Accessing data files

As part of the AWS to GCP migration, the data export files are hosted on Google Cloud Storage (GCS) instead of Amazon S3 buckets. The files can now be retrieved using a secured, API-based approach.

Data export interface changes

In GCP, a new cloud service called Ayla Credentials Service is provisioned to provide a short-lived access token (TTL: 1 hour) to client applications. A client application can use the token to access and download data files.

Accessing the data files

OEM admin credentials are required to access these files in GCS. This is a major enhancement over the AWS S3 implementation and provides a secured mechanism for accessing the files.

Data Storage Structure

AWS: there are individual buckets under the path "<OEM_ID>".
GCP: there is a common bucket named "<OEM_ID>", with individual folders under it.

Finding GCS URL

  1. Log in to the Ayla Customer Dashboard.

  2. Click OEM Profile in the navigation panel, and then click the Data Export tab to view the OEM access credentials.

Accessing existing data before the GCP migration

Data export files prior to the GCP migration will be retained in Amazon S3 per our Data Retention policy, and there is no change to their retrieval. You can find more details here

Sample code for accessing GCS data export files

Sample Python code to access and download the data files (CSVs) can be found here. This reference code can be used to create applications or tools in your preferred language to download CSVs.

Steps to download GCP data export files

  1. Sign in using the OEM::Admin credentials to get an auth token. For example:

Request:

curl --location 'https://user-dev.aylanetworks.com/users/sign_in.json' \
--header 'Content-Type: application/json' \
--data-raw '{
    "user": {
        "application": {
            "app_secret": "<app_secret>",
            "app_id": "<app_id>"
        },
        "password": "<password>",
        "email": "<email>"
    }
}'

Response:

{
    "access_token": "<access_token>",  
    "refresh_token": "<refresh_token>",  
    "expires_in": "<expires_in>",  
    "role": "<role>",  
    "role_tags": []  
}
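
The sign-in call in step 1 could be made from Python roughly as follows. This is a minimal sketch using only the standard library, not Ayla's reference code: the function names `build_sign_in_request` and `sign_in` are hypothetical, and the URL is copied from the curl example, so it may differ for your environment.

```python
import json
import urllib.request

# Endpoint copied from the curl example; other environments may use another host.
SIGN_IN_URL = "https://user-dev.aylanetworks.com/users/sign_in.json"


def build_sign_in_request(email, password, app_id, app_secret, url=SIGN_IN_URL):
    """Build (but do not send) the POST request for the step 1 sign-in call."""
    payload = {
        "user": {
            "email": email,
            "password": password,
            "application": {"app_id": app_id, "app_secret": app_secret},
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def sign_in(request):
    """Send the request and return the access_token field of the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

Keeping request construction separate from sending makes it possible to inspect the payload without contacting the service.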

  2. Get the bearer token using the Ayla Credential Service (ACS) API. An OEM::Admin token is required to call this API.

Request:

curl --location 'https://acs-dev.aylanetworks.com/credservice/v1/dataexports' \
--header 'Authorization: auth_token <access_token from step 1 response>'

Response:

{  
    "tokenValue": "ya29.dr.ARQiAB8rnx5SPNFDQ34EmG.......",  
    "scopes": [],  
    "expirationTime": "2024-05-21T10:37:03.020+00:00"  
}

NOTE: tokenValue is a short-lived token; use it before the expiration time given in expirationTime.
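
Because the token expires, a client may want to check expirationTime before each use and request a new token when little time is left. The sketch below assumes the response shape shown above; `build_acs_request`, `token_is_usable`, and the five-minute safety margin are illustrative choices, not part of the Ayla API.

```python
import urllib.request
from datetime import datetime, timedelta, timezone

# Endpoint copied from the curl example; other environments may use another host.
ACS_URL = "https://acs-dev.aylanetworks.com/credservice/v1/dataexports"


def build_acs_request(access_token, url=ACS_URL):
    """Build (but do not send) the GET request for the ACS call."""
    return urllib.request.Request(
        url, headers={"Authorization": f"auth_token {access_token}"}
    )


def token_is_usable(acs_response, now=None, margin=timedelta(minutes=5)):
    """Return True if the token has more than `margin` left before it expires."""
    expires_at = datetime.fromisoformat(acs_response["expirationTime"])
    now = now or datetime.now(timezone.utc)
    return now + margin < expires_at
```

A long-running download job could call `token_is_usable` before each request and repeat the ACS call when it returns False.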

  3. Use the Google Cloud Storage API to list the data files:

Request:

curl --location 'https://storage.googleapis.com/storage/v1/b/<bucket name>/o/?prefix=<oem_id>/' \
--header 'Authorization: Bearer <tokenValue from step 2 response>'

Response:

{  
    "kind": "storage#objects",  
    "items": [  
        {  
            "kind": "storage#object",  
            "id": "<bucket name>/<oem_id>/<object path>",  
            "selfLink": "https://www.googleapis.com/storage/v1/b/...",  
            "mediaLink": "https://storage.googleapis.com/download/storage/v1/b/...",  
            "name": "<object path>",  
            "bucket": "<bucket name>",  
            "generation": "1715249838995161",  
            "metageneration": "1",  
            "contentType": "application/octet-stream",  
            "storageClass": "...",  
            "size": "723",  
            "md5Hash": "...",  
            "crc32c": "...",  
            "etag": "...",  
            "timeCreated": "2024-05-09T10:17:18.996Z",  
            "updated": "2024-05-09T10:17:18.996Z",  
            "timeStorageClassUpdated": "2024-05-09T10:17:18.996Z"  
        }  
    ]  
}
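
A client typically needs only a few fields from this listing. The helper below is a sketch with a hypothetical name, `summarize_listing`. It relies on two behaviors of the Cloud Storage JSON API: the items key is omitted when nothing matches the prefix, and a nextPageToken field is present when more results remain.

```python
def summarize_listing(listing):
    """Return (name, size_in_bytes, mediaLink) tuples and the next page token.

    The API reports size as a string, so it is converted to int here.
    The token is None when there are no further pages.
    """
    files = [
        (item["name"], int(item["size"]), item["mediaLink"])
        for item in listing.get("items", [])
    ]
    return files, listing.get("nextPageToken")
```

When the returned token is not None, pass it as the pageToken query parameter on the next list request to fetch the following page.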

NOTE:

  • prefix=<oem_id>/ is a mandatory parameter.
  • To list the data files available for a specific timestamp (example: 2024-05-09 02:00:00), use the prefix <OEM_ID>/<EVENT>/2024-05-09 02:00:00/<FILE>
  • To list the files for a specific date, use the prefix <OEM_ID>/<EVENT>/<Date> (example: <OEM_ID>/<EVENT>/2024-05-09)
  • Use the tokenValue from the step 2 response as the bearer token in the Authorization header.
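
Timestamp prefixes contain spaces and colons, which must be URL-encoded in the query string. The following sketch builds the prefix and the list URL under the folder layout described in the notes above. The helper names are hypothetical, the event folder name used in the usage line is a placeholder (check your bucket for the actual names), and the URL uses the /o form of the Cloud Storage objects endpoint.

```python
from urllib.parse import quote, urlencode


def build_prefix(oem_id, event=None, when=None):
    """Build a listing prefix: OEM only, OEM + event, or OEM + event + date/timestamp."""
    if when and not event:
        raise ValueError("a date or timestamp prefix needs an event type")
    if not event:
        return f"{oem_id}/"
    if not when:
        return f"{oem_id}/{event}/"
    # No trailing slash, so a date such as 2024-05-09 matches every timestamp on that day.
    return f"{oem_id}/{event}/{when}"


def build_list_url(bucket, prefix, page_token=None):
    """Return the object-listing URL with all query values percent-encoded."""
    params = {"prefix": prefix}
    if page_token:
        params["pageToken"] = page_token
    return (
        "https://storage.googleapis.com/storage/v1/b/"
        f"{quote(bucket, safe='')}/o?{urlencode(params, quote_via=quote)}"
    )
```

For example, `build_list_url("<bucket name>", build_prefix("<oem_id>", "<event>", "2024-05-09 02:00:00"))` encodes the space as %20 and each colon as %3A.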
  4. Download the data files. The files are downloaded in the same format in which they are stored in GCS.

curl --location '<mediaLink from step 3 response>' \
--header 'Authorization: Bearer <tokenValue from step 2 response>'
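
A download script can verify each file against the md5Hash field from the listing, which Cloud Storage reports as a base64-encoded MD5 digest. The sketch below is one possible approach; the function names, the local `exports` directory, and the replacement of colons in folder names (for file systems that do not allow them) are assumptions for illustration.

```python
import base64
import hashlib
import urllib.request
from pathlib import Path


def md5_matches(content, md5_hash):
    """Compare downloaded bytes with the base64-encoded MD5 from the listing."""
    digest = hashlib.md5(content).digest()
    return base64.b64encode(digest).decode("ascii") == md5_hash


def local_path_for(object_name, root="exports"):
    """Map an object name to a local path, keeping the folder structure."""
    parts = [
        part.replace(":", "-")
        for part in object_name.split("/")
        if part not in ("", ".", "..")
    ]
    return Path(root).joinpath(*parts)


def download(media_link, token_value, object_name, md5_hash, root="exports"):
    """Download one object, verify its checksum, and write it to disk."""
    request = urllib.request.Request(
        media_link, headers={"Authorization": f"Bearer {token_value}"}
    )
    with urllib.request.urlopen(request) as response:
        content = response.read()
    if not md5_matches(content, md5_hash):
        raise ValueError(f"checksum mismatch for {object_name}")
    path = local_path_for(object_name, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return path
```

This version reads each file fully into memory, which suits small CSV files; very large files would be better streamed to disk in chunks.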