InfluxDB - Marcus Noll

## Primer

- InfluxDB is a database for storing (large) series of time-stamped data points, for example to plot a graph from them
- This kind of database is called a "[time series](https://grafana.com/docs/grafana/latest/basics/timeseries/) database"
- Nearly all of the explanations I found on time series databases talk about measuring numeric values (e.g. response times or temperatures). But it is entirely possible to store logs or error messages in InfluxDB
- InfluxDB treats two data points as the same point if they have the same measurement, tag set and timestamp. Their fields are then merged, and for a field that exists in both, the newer value wins
- Data points can be tagged

## Whether to store data as field or as tag

A tricky question seems to be what to store as **field** and what to store as **tag**. There are quite a few discussions about that across the web.

The way I think of the **field** vs. **tag** question is this:

Tags help the system to filter out subsets from a huge pile of data points. Tags are indexed, fields are not, so the more you can filter out by using tags, the faster the query for the information you are actually after.

Each tag should only have a small number of possible values. With several such tags combined, the filtering process remains cheap and helps reduce the number of field values that have to be queried in the end.

Data that has more variability (e.g. unique IDs) should not be stored as tag: every new tag value creates a new series, and a very large number of series slows down writing to and reading from the database.

See also [Resolve high series cardinality](https://docs.influxdata.com/influxdb/v2.5/write-data/best-practices/resolve-high-cardinality/#count-unique-tag-values) in the InfluxDB docs.

## Aggregation

> Depending on what you’re measuring, the data can vary greatly. What if you wanted to compare periods longer than the interval between measurements? If you’d measure the temperature once every hour, you’d end up with 24 data points per day. To compare the temperature in August over the years, you’d have to combine the 31 times 24 data points into one.
>
> [...]
>
> How you choose to aggregate your time series data is an important decision and depends on the story you want to tell with your data. It’s common to use different aggregations to visualize the same time series data in different ways.

-- [Source](https://grafana.com/docs/grafana/latest/basics/timeseries/#aggregating-time-series)

Use [`aggregateWindow()`](https://docs.influxdata.com/influxdb/cloud/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/) to control the aggregation behavior. For example, you can set the length of the window (`every`) and the function that is applied to the data inside each window (`fn`), such as `mean`, `count`, `sum` or `last`.

## Flux queries

Example:

```
// v.timeRangeStart, v.timeRangeStop and v.windowPeriod are provided by the UI
// the query runs in (e.g. the InfluxDB Data Explorer)
from(bucket: "bucket-name")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Measurement Name")
  |> filter(fn: (r) => r["Tag name"] == "value")
  |> filter(fn: (r) => r["Another tag name"] == "value")
  |> filter(fn: (r) => r._field == "Field name")
  // Regex matching on _value assumes that the field holds string values
  |> filter(fn: (r) => r._value =~ /the value you are looking for (Regex in this case)/)
  // Keep only the last value of each window and skip windows without data
  |> aggregateWindow(every: v.windowPeriod, fn: last, createEmpty: false)
```

### About the `_start`, `_stop` and `_time` fields in result tables

`_start` and `_stop` in the result tables are the boundaries of the time range defined in `|> range(start: v.timeRangeStart, stop: v.timeRangeStop)`. `_time` is the timestamp of the data point.

### Ungroup/flatten tables...

... by calling `group()` without arguments.

## New buckets cannot be assigned to an existing token

Instead, a new token has to be created.

## Visualize InfluxDB data with Grafana

Make sure you [configure Grafana](https://docs.influxdata.com/influxdb/v1.8/tools/grafana/) to use the InfluxDB query language you are writing your queries in (either InfluxQL or Flux). Otherwise, your queries may not return the expected results.
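Going back to the Aggregation section: to get a feeling for what a windowed aggregation does, here is a minimal Python sketch of the idea. It is not how InfluxDB implements `aggregateWindow()`; the function name `aggregate_window`, the sample readings and the choice to label each window with its start time are assumptions made for this illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean


def aggregate_window(points, every, fn):
    """Group (timestamp, value) pairs into fixed windows of length `every`
    and apply `fn` to the values of each window.

    Windows without data are skipped, similar in spirit to createEmpty: false.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    windows = defaultdict(list)
    for ts, value in points:
        # Index of the window this timestamp falls into, counted from the epoch
        index = (ts - epoch) // every
        windows[index].append(value)
    # Assumption of this sketch: each window is labeled with its start time,
    # which may differ from what aggregateWindow() reports as _time
    return [(epoch + i * every, fn(vals)) for i, vals in sorted(windows.items())]


start = datetime(2023, 8, 1, tzinfo=timezone.utc)
# Hypothetical hourly temperature readings for two days (24 points per day)
readings = [(start + timedelta(hours=h), 20.0 + (h % 24)) for h in range(48)]

# Same data, two different aggregations
daily_mean = aggregate_window(readings, timedelta(days=1), mean)
daily_last = aggregate_window(readings, timedelta(days=1), lambda vals: vals[-1])
```

Both results contain one value per day instead of 24. Which function you pass (`mean`, "last value", a count, ...) decides what story the resulting graph tells.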
## Documentation entry points and helpful articles

- [InfluxDB data elements](https://docs.influxdata.com/influxdb/cloud/reference/key-concepts/data-elements/)
- [InfluxDB schema design](https://docs.influxdata.com/influxdb/cloud/write-data/best-practices/schema-design/)
- [What is the criteria to use multiple fields per measurement?](https://community.influxdata.com/t/what-is-the-criteria-to-use-multiple-fields-per-measurement/2174)
- [What’s the logical connection between buckets, measurements & retention policies in InfluxDB 2.0?](https://community.influxdata.com/t/whats-the-logical-connection-between-buckets-measurements-retention-policies-in-influxdb-2-0/15900)
- [Handle duplicate data points](https://docs.influxdata.com/influxdb/v2.0/write-data/best-practices/duplicate-points/)
- [Resolve high series cardinality](https://docs.influxdata.com/influxdb/cloud/write-data/best-practices/resolve-high-cardinality/)
- [Learn the basics about InfluxDB queries with Flux](https://docs.influxdata.com/influxdb/v2.0/query-data/get-started/) in the [Data Explorer](https://docs.influxdata.com/influxdb/v2.0/query-data/execute-queries/data-explorer/)
- [Top 5 Hurdles for Flux Beginners and Resources for Learning to Use Flux](https://dganais.medium.com/are-you-new-to-influxdb-v2-0-33c11f47099c)
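As a companion to the "Handle duplicate data points" article listed above, here is a small Python sketch of how I picture the merge behavior: a point is identified by measurement, tag set and timestamp, and writing to the same identity merges the fields, with the newer value winning on conflict. The `write_point` helper, the dictionary used as a store and the sample data are made up for this illustration and have nothing to do with InfluxDB's actual storage engine.

```python
def write_point(store, measurement, tags, timestamp, fields):
    """Write a point into `store` (a plain dict standing in for the database).

    A point with the same measurement, tag set and timestamp as an existing
    one is merged into it instead of being stored a second time.
    """
    key = (measurement, frozenset(tags.items()), timestamp)
    # Union of old and new fields; on conflict the newly written value wins
    store.setdefault(key, {}).update(fields)
    return store


store = {}
write_point(store, "web", {"host": "a"}, 1700000000, {"status": 200, "ms": 12})
# Same measurement, tags and timestamp: merged into the first point
write_point(store, "web", {"host": "a"}, 1700000000, {"ms": 15, "error": "timeout"})
# Different tag value: a separate point
write_point(store, "web", {"host": "b"}, 1700000000, {"ms": 9})
```

After these three writes the store holds two points. The first one has the fields `status`, `ms` (overwritten with `15`) and `error`. If you need to keep both writes as separate points, they have to differ in a tag or in the timestamp.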