linkTitle: Data reference
title: Data management reference
weight: 999
layout: docs
type: docs
description: Data schema, query operators, configuration fields, supported resources, and storage behavior.
aliases:
  - /data/query-reference/
  - /manage/data/query/
  - /use-cases/sensor-data-query/
  - /use-cases/sensor-data-query-with-third-party-tools/
  - /how-tos/sensor-data-query-with-third-party-tools/
  - /data-ai/data/query/
  - /data/advanced-data-capture-sync/
  - /data-ai/capture-data/advanced/advanced-data-capture-sync/
  - /data-ai/capture-data/advanced/
  - /data/query/query-reference/
  - /data/capture-sync/advanced-data-capture-sync/
date: 2025-02-10

Data schema

For the full schema of the readings table (document format, column reference, the data column, and per-component data structures), see Captured data schema.

Query reference

{{% alert title="Note" color="note" %}} Tabular data queries (TabularDataByMQL and TabularDataBySQL) share a 100 TB monthly processing limit across your organization. Queries that exceed the limit return an error. Contact us to request an increase. {{% /alert %}}

Indexed fields and query optimization

You can improve query performance by filtering on indexed fields early in your query. Viam stores data in blob storage using the path pattern:

/organization_id/location_id/robot_id/part_id/component_type/component_name/method_name/capture_day/*

The more of these path fields you filter on, starting from the beginning of the path, the faster your query runs. These fields are indexed:

  • organization_id
  • location_id
  • robot_id
  • part_id
  • component_type
  • component_name
  • method_name
  • capture_day

Additional optimization techniques:

  • Filter and reduce data early. Use $match (MQL) or WHERE (SQL) before expensive operations like grouping or sorting.
  • Use $project early to drop unneeded fields from the processing pipeline.
  • Use $limit or LIMIT while developing queries to avoid scanning your entire dataset.
  • For frequent queries on recent data, use the hot data store.
  • For recurring queries (dashboards), use data pipelines to pre-compute materialized views.
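
The ordering advice above can be sketched as a Python list of dicts that mirrors the MQL stage documents. This is a minimal sketch, not a definitive implementation: the function name `build_pipeline` and the argument values are hypothetical, the filter keys come from the indexed-field list, and `time_received` and `data` are the columns referenced elsewhere on this page. How you submit the pipeline (for example through TabularDataByMQL) depends on your client and is not shown.

```python
def build_pipeline(organization_id, part_id, component_name, method_name, limit=100):
    """Return an aggregation pipeline that filters early and trims fields.

    Hypothetical helper for illustration; it only builds the stage list.
    """
    return [
        # 1. Filter on indexed fields first so less data is scanned.
        {"$match": {
            "organization_id": organization_id,
            "part_id": part_id,
            "component_name": component_name,
            "method_name": method_name,
        }},
        # 2. Drop fields that later stages do not need.
        {"$project": {"time_received": 1, "data": 1}},
        # 3. Cap the result size while developing the query.
        {"$limit": limit},
    ]


# Placeholder IDs: substitute real values from your organization.
pipeline = build_pipeline("<organization-id>", "<part-id>", "my-sensor", "Readings")
```

Expensive stages such as $group or $sort would go after the $project stage, once the data set has already been reduced.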

Supported MQL operators

Viam supports a subset of MongoDB aggregation pipeline stages. Stages not on this list return an error.

  • $addFields
  • $bucket
  • $bucketAuto
  • $count
  • $densify
  • $fill
  • $geoNear
  • $group
  • $limit
  • $match
  • $project
  • $redact
  • $replaceRoot
  • $replaceWith
  • $sample
  • $set
  • $setWindowFields
  • $skip
  • $sort
  • $sortByCount
  • $unset
  • $unwind

See the MQL documentation for syntax details.
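
To catch an unsupported stage before sending a query, you could check a pipeline against the list above on the client side. The following is a minimal sketch under that assumption; `SUPPORTED_STAGES` is copied from this page and `unsupported_stages` is a hypothetical helper, not part of any Viam SDK.

```python
# Stage names copied from the supported list on this page.
SUPPORTED_STAGES = frozenset({
    "$addFields", "$bucket", "$bucketAuto", "$count", "$densify", "$fill",
    "$geoNear", "$group", "$limit", "$match", "$project", "$redact",
    "$replaceRoot", "$replaceWith", "$sample", "$set", "$setWindowFields",
    "$skip", "$sort", "$sortByCount", "$unset", "$unwind",
})


def unsupported_stages(pipeline):
    """Return the stage names in `pipeline` that are not supported, in order."""
    found = []
    for stage in pipeline:
        # Each stage document normally has exactly one key: the stage name.
        for name in stage:
            if name not in SUPPORTED_STAGES:
                found.append(name)
    return found


candidate = [{"$match": {"component_name": "my-sensor"}}, {"$lookup": {}}, {"$limit": 5}]
print(unsupported_stages(candidate))
```

The check only looks at top-level stage names; it does not validate the operators used inside each stage.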

SQL limitations

Viam supports the MongoDB Atlas SQL dialect. Keep the following in mind when writing queries:

  • If a database, table, or column identifier begins with a digit, a reserved character, or conflicts with a reserved SQL keyword, surround it with backticks (`) or double quotes (").
  • To include a single quote in a string literal, write two single quotes. For example, 'o''clock' represents the string o'clock.
  • The date data type is not supported. Use timestamp instead.

For a full list of limitations, see the MongoDB Atlas SQL Interface Language Reference.
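
The quoting rules above can be applied mechanically when you build SQL text in code. The sketch below is illustrative only: `quote_identifier` and `sql_string_literal` are hypothetical helpers, `RESERVED_WORDS` is a small sample rather than the real reserved-word list, and the helper is deliberately conservative (it quotes any identifier that is not made of plain letters, digits, and underscores).

```python
import re

# Illustrative subset only; consult the Atlas SQL reference for the full list.
RESERVED_WORDS = frozenset({"select", "from", "where", "group", "order", "by", "limit"})

# Identifiers that start with a letter or underscore and contain no special characters.
_PLAIN_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name):
    """Wrap `name` in backticks when it is not a plain, non-reserved identifier."""
    if "`" in name:
        # Escaping embedded backticks is out of scope for this sketch.
        raise ValueError("identifier contains a backtick")
    if _PLAIN_IDENTIFIER.fullmatch(name) and name.lower() not in RESERVED_WORDS:
        return name
    return "`" + name + "`"


def sql_string_literal(value):
    """Return `value` as a single-quoted literal with embedded quotes doubled."""
    return "'" + value.replace("'", "''") + "'"


print(quote_identifier("1st_reading"))   # needs quoting: begins with a digit
print(sql_string_literal("o'clock"))
```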

Date queries

MQL time-range queries perform better with the BSON date type than with $toDate. Use JavaScript Date() constructors in mongosh:

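// Switch to the database that holds captured tabular data.
use sensorData

// Build native Date objects so the comparison uses the BSON date type.
const startTime = new Date('2024-02-10T19:45:07.000Z')
const endTime = new Date() // current time

// Match readings received within the time range.
db.readings.aggregate([
    { $match: {
        time_received: {
            $gte: startTime,
            $lte: endTime
        }
    }}
])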
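
If you build the same query from Python, the equivalent of a JavaScript Date is a timezone-aware `datetime`, which BSON encoders such as the one shipped with PyMongo serialize as the BSON date type. The following is a minimal sketch under that assumption; `time_range_match` is a hypothetical helper that only builds the $match stage and does not run the query.

```python
from datetime import datetime, timezone


def time_range_match(start, end):
    """Build a $match stage on time_received from two aware datetimes."""
    if start.tzinfo is None or end.tzinfo is None:
        # Naive datetimes are ambiguous; require an explicit time zone.
        raise ValueError("start and end must be timezone-aware")
    if start > end:
        raise ValueError("start must not be later than end")
    return {"$match": {"time_received": {"$gte": start, "$lte": end}}}


start_time = datetime(2024, 2, 10, 19, 45, 7, tzinfo=timezone.utc)
end_time = datetime.now(timezone.utc)
stage = time_range_match(start_time, end_time)
```

Passing `datetime` values directly, rather than ISO strings wrapped in $toDate, keeps the comparison on native dates as recommended above.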

Permissions

Users with owner or operator roles at the organization, location, or machine level can query data. See Role-Based Access Control for details.

Supported resources

The following components and services support data capture and cloud sync. The table shows which capture methods are available for each resource type. Not all models support all methods listed for their type.

{{< readfile "/static/include/data/capture-supported.md" >}}

If the resource type you need is not listed, you can still capture data from it using the DoCommand method with a custom docommand_input parameter.
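
As a rough illustration of the DoCommand approach, the sketch below builds one capture_methods entry with docommand_input nested under additional_params, as described in the capture method attributes later on this page. The helper name `docommand_capture_method` and the payload `{"command": "get_status"}` are hypothetical; the keys your resource accepts depend on its own DoCommand implementation.

```python
import json


def docommand_capture_method(docommand_input, capture_frequency_hz):
    """Return a capture_methods entry that captures DoCommand output."""
    if not isinstance(docommand_input, dict):
        raise TypeError("docommand_input must be an object (dict)")
    return {
        "method": "DoCommand",
        "capture_frequency_hz": capture_frequency_hz,
        # DoCommand capture needs the command payload under additional_params.
        "additional_params": {"docommand_input": docommand_input},
    }


# Hypothetical payload for illustration only.
entry = docommand_capture_method({"command": "get_status"}, 0.2)
print(json.dumps(entry, indent=2))
```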

Capture and sync configuration

This section describes the configuration fields for data capture and cloud sync.

Data management service attributes

The data management service controls sync behavior, storage paths, and deletion policies. Most of these settings are configured through the Viam app UI. Edit JSON directly for settings not exposed in the UI, such as deletion thresholds, sync thread limits, and MongoDB capture.

{{< tabs >}} {{% tab name="viam-server" %}}

{
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "sync_interval_mins": 1,
        "capture_dir": "",
        "tags": [],
        "capture_disabled": false,
        "sync_disabled": true,
        "delete_every_nth_when_disk_full": 5,
        "maximum_num_sync_threads": 250
      }
    }
  ]
}

{{% /tab %}} {{% tab name="viam-micro-server" %}}

{
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": [],
        "sync_interval_mins": 3
      }
    }
  ]
}

{{% /tab %}} {{< /tabs >}}

| Name | Type | Required? | Description | viam-micro-server Support |
| --- | --- | --- | --- | --- |
| capture_disabled | bool | Optional | Toggle data capture on or off for the entire machine {{< glossary_tooltip term_id="part" text="part" >}}. Even if capture is enabled for the whole part, data is only captured from components that have capture individually configured. Default: false | |
| capture_dir | string | Optional | Path to the directory where captured data is stored. If you change this, only new data goes to the new directory; existing data stays where it was. Default: ~/.viam/capture | |
| tags | array of strings | Optional | Tags applied to all data captured by this machine part. May include alphanumeric characters, underscores, and dashes. | |
| sync_disabled | bool | Optional | Toggle cloud sync on or off for the entire machine {{< glossary_tooltip term_id="part" text="part" >}}. Default: false | |
| additional_sync_paths | array of strings | Optional | Additional directories to sync to the cloud. Data is deleted from the directory after syncing. Use absolute paths. | |
| sync_interval_mins | float | Optional | Minutes between sync attempts. Your hardware or network speed may impose practical limits. Default: 0.1 (every 6 seconds). | |
| selective_syncer_name | string | Optional | Name of the sensor that controls selective sync. Also add this sensor to the depends_on field. See Conditional sync. | |
| delete_every_nth_when_disk_full | int | Optional | When local storage meets the fullness criteria, the service deletes every Nth captured file. Default: 5 | |
| maximum_num_sync_threads | int | Optional | Max CPU threads for syncing to the cloud. Higher values may improve throughput but can cause instability on constrained devices. Default: runtime.NumCPU/2 | |
| mongo_capture_config.uri | string | Optional | MongoDB URI for writing tabular data alongside disk capture. | |
| mongo_capture_config.database | string | Optional | Database name for MongoDB capture. Default: "sensorData" | |
| mongo_capture_config.collection | string | Optional | Collection name for MongoDB capture. Default: "readings" | |
| maximum_capture_file_size_bytes | int | Optional | Maximum size in bytes of each capture file on disk. When a capture file reaches this size, a new file is created. Default: 262144 (256 KB) | |
| file_last_modified_millis | int | Optional | How long (in ms) an arbitrary file must be unmodified before it is eligible for sync. Normal .capture files sync immediately. Default: 10000 | |
| disk_usage_deletion_threshold | float | Optional | Disk usage ratio (0-1) at or above which captured files are deleted, provided the capture directory also meets capture_dir_deletion_threshold. Default: 0.9 | |
| capture_dir_deletion_threshold | float | Optional | Ratio (0-1) of disk usage attributable to the capture directory, at or above which deletion occurs (if disk_usage_deletion_threshold is also met). Default: 0.5 | |

Platform-managed service settings

The following settings appear in your machine's configuration but are not processed by viam-server on your machine. They are read and enforced by the Viam cloud platform:

| Name | Type | Description |
| --- | --- | --- |
| delete_data_on_part_deletion | bool | Whether deleting this machine or machine part also deletes all its captured cloud data. Default: false. |

Data capture method attributes

Data capture is configured per-resource in the service_configs array of a component or service. When you configure capture through the Viam app UI, these fields are set automatically. The table below is the JSON-level reference for manual configuration.

Here is where capture attributes live in a component's JSON config:

{
  "name": "my-sensor",
  "api": "rdk:component:sensor",
  "model": "rdk:builtin:fake",
  "service_configs": [
    {
      "type": "data_manager",
      "attributes": {
        "capture_methods": [
          {
            "method": "Readings",
            "capture_frequency_hz": 0.2,
            "additional_params": {},
            "disabled": false
          }
        ]
      }
    }
  ]
}

{{< alert title="Caution" color="caution" >}} Avoid configuring capture rates higher than your hardware can handle. Capturing faster than the hardware supports degrades performance. {{< /alert >}}

| Name | Type | Required? | Description |
| --- | --- | --- | --- |
| name | string | Required | Fully qualified resource name (for example, rdk:component:sensor/my-sensor). |
| method | string | Required | Depends on the component or service type. See Supported resources. Individual tabular readings larger than 4 MB are rejected at upload time and will not sync to the cloud. |
| capture_frequency_hz | float | Required | Frequency in hertz. For example, 0.5 = one reading every 2 seconds. |
| additional_params | object | Optional | Method-specific parameters. For example, DoCommand requires a docommand_input object; GetImages accepts a filter_source_names list. |
| disabled | boolean | Optional | Whether capture is disabled for this method. |
| tags | array of strings | Optional | Tags applied to data captured by this specific method. Added alongside any tags set at the service level. |
| capture_directory | string | Optional | Override the capture directory for this specific resource. If not set, uses the service-level capture_dir. |
| capture_queue_size | int | Optional | Size of the buffer between the capture goroutine and the file writer. Default: 250 |
| capture_buffer_size | int | Optional | Size in bytes of the buffer used when writing captured data to disk. Default: 4096 |
| cache_size_kb | float | Optional | viam-micro-server only. Max storage (KB) per data collector. Default: 1 |
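
The relationship between capture_frequency_hz and the time between readings is a simple reciprocal. The sketch below works through that arithmetic; both function names are hypothetical, and the per-day figure assumes capture runs continuously and the hardware keeps up with the configured rate.

```python
SECONDS_PER_DAY = 86_400


def capture_interval_seconds(capture_frequency_hz):
    """Seconds between readings for a given capture frequency."""
    if capture_frequency_hz <= 0:
        raise ValueError("capture_frequency_hz must be positive")
    return 1.0 / capture_frequency_hz


def readings_per_day(capture_frequency_hz):
    """Approximate number of readings captured in 24 hours of continuous capture."""
    return round(capture_frequency_hz * SECONDS_PER_DAY)


# 0.5 Hz is one reading every 2 seconds; 0.2 Hz is one reading every 5 seconds.
for hz in (0.5, 0.2):
    print(hz, capture_interval_seconds(hz), readings_per_day(hz))
```

Estimating the daily reading count this way is a quick check on whether a chosen frequency fits your storage and sync budget.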

Platform-managed capture settings

The following settings are processed by the Viam cloud platform, not by viam-server. retention_policy is set at the attributes level (sibling to capture_methods), while recent_data_store is set inside an individual capture_methods[] entry:

{
  "type": "data_manager",
  "attributes": {
    "capture_methods": [
      {
        "method": "Readings",
        "capture_frequency_hz": 0.2,
        "recent_data_store": {
          "stored_hours": 24
        }
      }
    ],
    "retention_policy": {
      "days": 30
    }
  }
}

| Name | Type | Level | Description |
| --- | --- | --- | --- |
| retention_policy | object | attributes (sibling to capture_methods) | How long captured data is retained in the cloud. Options: "days": <int>, "binary_limit_gb": <int>, "tabular_limit_gb": <int>. Days are in UTC. |
| recent_data_store | object | Inside a capture_methods[] entry | Store a rolling window of recent data in a hot data store for faster queries. Example: { "stored_hours": 24 } |
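
Because the two settings live at different levels, it is easy to put one in the wrong place. The sketch below is a hypothetical lint, not a Viam tool: `misplaced_settings` takes one service_configs entry as a Python dict and reports a setting that sits at the wrong level, based only on the placement rules described in this section.

```python
def misplaced_settings(service_config):
    """Return a list of messages for settings found at the wrong level."""
    problems = []
    attributes = service_config.get("attributes", {})
    # recent_data_store belongs inside an individual capture method.
    if "recent_data_store" in attributes:
        problems.append("recent_data_store belongs inside a capture_methods[] entry")
    # retention_policy belongs next to capture_methods, not inside an entry.
    for index, method in enumerate(attributes.get("capture_methods", [])):
        if "retention_policy" in method:
            problems.append(
                f"capture_methods[{index}]: retention_policy belongs at the attributes level"
            )
    return problems


config = {
    "type": "data_manager",
    "attributes": {
        "capture_methods": [
            {
                "method": "Readings",
                "capture_frequency_hz": 0.2,
                "recent_data_store": {"stored_hours": 24},
            }
        ],
        "retention_policy": {"days": 30},
    },
}
print(misplaced_settings(config))
```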

To capture data from remote parts, see Capture from multi-part machines.

Local storage

This section describes how captured data is stored on the machine before syncing.

Capture directories

By default, captured data is stored in ~/.viam/capture. The actual path depends on your platform:

| Platform | Default directory |
| --- | --- |
| Windows | With viam-agent: C:\Windows\system32\config\systemprofile\.viam\capture. Manual installation: C:\Users\admin\.viam\capture. |
| Linux | With root or sudo: /root/.viam/capture. |
| macOS | /Users/<username>/.viam/capture. |

{{% expand "Can't find the capture directory?" %}}

The path depends on where viam-server runs and the operating system. Check your machine's startup logs for the $HOME value:

2025-01-15T14:27:26.073Z    INFO    rdk    server/entrypoint.go:77    Starting viam-server with following environment variables    {"HOME":"/home/johnsmith"}

{{% /expand%}}

You can change the capture directory with the capture_dir attribute in the data management service attributes.
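
To see how much unsynced data is waiting on disk, you could walk the capture directory yourself. The sketch below is a minimal example with hypothetical helper names; `default_capture_dir` resolves ~/.viam/capture for the user running the script, which may differ from the user that runs viam-server (for example root), so pass the real path explicitly if they differ.

```python
from pathlib import Path


def default_capture_dir():
    """Return ~/.viam/capture for the current user."""
    return Path.home() / ".viam" / "capture"


def capture_dir_usage(capture_dir):
    """Return (file_count, total_bytes) for all files under capture_dir."""
    if not capture_dir.is_dir():
        return (0, 0)
    file_count = 0
    total_bytes = 0
    for path in capture_dir.rglob("*"):
        try:
            if path.is_file():
                total_bytes += path.stat().st_size
                file_count += 1
        except FileNotFoundError:
            # Files can disappear mid-scan because synced data is deleted.
            continue
    return (file_count, total_bytes)


count, size = capture_dir_usage(default_capture_dir())
print(f"{count} files, {size} bytes waiting in {default_capture_dir()}")
```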

Automatic deletion

After data syncs successfully, it is automatically deleted from local storage. While a machine is offline, captured data accumulates locally.

{{< alert title="Warning" color="warning" >}} If your machine is offline and its disk fills up, the data management service will delete captured data to free space and keep the machine running. {{< /alert >}}

Automatic deletion triggers when all of these conditions are met:

  • Data capture is enabled
  • Local disk usage is at or above the disk_usage_deletion_threshold (default: 90%)
  • The capture directory accounts for at least the capture_dir_deletion_threshold proportion of disk usage (default: 50%)

Control deletion behavior with the delete_every_nth_when_disk_full attribute.
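
The three conditions above can be expressed as a single check. This is a sketch of the documented rule, not the service's actual implementation: `deletion_triggered` is a hypothetical function, and it assumes the capture directory's share is measured against used disk space (capture directory bytes divided by used bytes), which this page does not spell out.

```python
def deletion_triggered(
    capture_enabled,
    disk_total_bytes,
    disk_used_bytes,
    capture_dir_bytes,
    disk_usage_deletion_threshold=0.9,
    capture_dir_deletion_threshold=0.5,
):
    """Return True when all three documented deletion conditions hold."""
    if not capture_enabled or disk_total_bytes <= 0 or disk_used_bytes <= 0:
        return False
    disk_usage_ratio = disk_used_bytes / disk_total_bytes
    # ASSUMPTION: share of *used* space taken up by the capture directory.
    capture_share = capture_dir_bytes / disk_used_bytes
    return (
        disk_usage_ratio >= disk_usage_deletion_threshold
        and capture_share >= capture_dir_deletion_threshold
    )


# Disk 95% full, capture directory holds about 63% of the used space.
print(deletion_triggered(True, 100, 95, 60))
```

With the defaults, a disk that is 95% full does not trigger deletion if captured data makes up only a small part of what is stored, because the second threshold is not met.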

Micro-RDK

The micro-RDK (for ESP32 and similar microcontrollers) supports data capture with a smaller set of resources than viam-server. See the Micro-RDK tab in the supported resources table for the specific methods available.

On micro-RDK devices, captured data is stored in the ESP32's flash memory until it is uploaded to the cloud. If the machine restarts before all data is synced, any data captured since the last sync point is lost.