stork issues
https://gitlab.isc.org/isc-projects/stork/-/issues
2024-03-05T15:02:12Z

Upgrade Grafana and its dashboards
https://gitlab.isc.org/isc-projects/stork/-/issues/1322
2024-03-05T15:02:12Z
Slawek Figiel

You cannot import the example Grafana dashboards to a modern Grafana instance.
![image](/uploads/a95f33a1aa876ae18d92057fd933fc5f/image.png)
The Stork demo uses the `8.3.7` Grafana version. The latest version is `10.3.3`.
We should upgrade Grafana (and maybe Prometheus) used in the demo and migrate the example dashboards to a modern format.
1.16

Shared network address utilization not consistent
https://gitlab.isc.org/isc-projects/stork/-/issues/1214
2024-02-27T10:03:39Z
Victor Petrescu

Hi everyone,
I've encountered an issue with the Shared Network Address Utilization values. The values exposed at the Stork server's /metrics endpoint frequently differ from those shown in the Stork web application.
For example:
Information from Stork Web App:
![Screenshot_1205](/uploads/0d06970d5bdb83da790e7521dcde773e/Screenshot_1205.png)
Information from Stork Server /metrics:
storkserver_shared_network_address_utilization{name="1"} 0.005
storkserver_shared_network_address_utilization{name="2"} 0.154
storkserver_shared_network_address_utilization{name="3"} 0.004
storkserver_shared_network_address_utilization{name="4"} 0.003
storkserver_shared_network_address_utilization{name="5"} 0.003
storkserver_shared_network_address_utilization{name="6"} 0
As you can see, the values don't match. Strangely, they sometimes do match.
Stork Web App is showing the correct values, the problem is with the ones from /metrics.
Thank you!
1.16

Attach more labels to the Prometheus samples
https://gitlab.isc.org/isc-projects/stork/-/issues/1323
2024-03-19T14:58:05Z
Slawek Figiel

Currently, if the subnet_cmds hook is not loaded, the metrics are labeled with the subnet ID; if it is loaded, they are labeled with the subnet name prefix if provided, otherwise with the subnet ID.
Our customer needs samples labeled by subnet ID regardless of the subnet_cmds presence. It would also be helpful to attach the shared network name.
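The current and requested labeling rules can be sketched as follows. This is only an illustration in Python, not Stork's exporter code: the function names and the label keys (`subnet`, `subnet_id`, `shared_network`) are assumptions made for the example.

```python
from typing import Dict, Optional


def current_labels(subnet_id: int, has_subnet_cmds: bool,
                   name: Optional[str] = None) -> Dict[str, str]:
    """Labeling as described above: the name is used only when the hook is loaded."""
    if has_subnet_cmds and name:
        return {"subnet": name}
    return {"subnet": str(subnet_id)}


def requested_labels(subnet_id: int, name: Optional[str] = None,
                     shared_network: Optional[str] = None) -> Dict[str, str]:
    """Requested behavior: the subnet ID is always present, other labels are optional."""
    labels = {"subnet_id": str(subnet_id)}  # hypothetical label key
    if name:
        labels["subnet"] = name
    if shared_network:
        labels["shared_network"] = shared_network  # hypothetical label key
    return labels
```

With the requested rule, a query can always select on the subnet ID, whether or not the hook is present on a given Kea instance.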
[SF#1762](https://isc.lightning.force.com/lightning/r/Case/500S6000006AQSSIA4/view)
1.17

Prometheus exporters should fetch the data on demand
https://gitlab.isc.org/isc-projects/stork/-/issues/1315
2024-03-05T14:56:38Z
Slawek Figiel

Currently, we periodically aggregate the metrics for Prometheus using internal Stork intervals. As a result, the Prometheus collector always pulls cached values that are slightly out of date.
We should avoid using the internal collecting loop and our own intervals. The Prometheus collector should be the one to decide when and how often the data are fetched.
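The difference between the two approaches can be shown with a small Python sketch. It is not Stork code: `CachedExporter`, `OnDemandExporter`, and `fetch_stats` are hypothetical names, and the data source is a plain dictionary standing in for Kea or the database.

```python
from typing import Callable, Dict

Fetch = Callable[[], Dict[str, float]]


class CachedExporter:
    """Serves whatever the last internal refresh stored (the current behavior)."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._cache: Dict[str, float] = {}

    def refresh(self) -> None:
        # Would be driven by an internal timer, independent of scrapes.
        self._cache = self._fetch()

    def scrape(self) -> Dict[str, float]:
        return dict(self._cache)


class OnDemandExporter:
    """Fetches fresh data each time the /metrics endpoint is scraped."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch

    def scrape(self) -> Dict[str, float]:
        return self._fetch()


# Hypothetical data source whose value changes over time.
state = {"assigned": 1.0}


def fetch_stats() -> Dict[str, float]:
    return dict(state)


cached = CachedExporter(fetch_stats)
on_demand = OnDemandExporter(fetch_stats)
cached.refresh()
state["assigned"] = 2.0  # the source changes after the last internal refresh
```

After the change, `cached.scrape()` still reports the old value until the next internal refresh, while `on_demand.scrape()` reports the new one. The cost of the on-demand variant is that every scrape triggers a fetch, so the scrape interval directly determines the load on the source.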
- [ ] The Stork agent should send the `statistics-get-all` command and process the response on request to the `/metrics` endpoint
- [ ] The Stork server should retrieve the adjusted utilizations and machine counters on request to the `/metrics` endpoint
backlog

subnet stats should display a warning message if they are out of date
https://gitlab.isc.org/isc-projects/stork/-/issues/1177
2023-10-10T13:29:01Z
Razvan Becheriu

The last time the stats were received is stored in the database, so if the stats are older than `current time - fetch time interval`, a warning should be displayed that the stats are out of date.
backlog

add pool statistics to prometheus
https://gitlab.isc.org/isc-projects/stork/-/issues/1153
2023-09-12T13:48:16Z
Razvan Becheriu

New stats have been added in 2.4.0:
v4:
```
"subnet[1].pool[0].assigned-addresses": [
[
0,
"2023-06-13 20:42:46.836205"
]
],
"subnet[1].pool[0].cumulative-assigned-addresses": [
[
0,
"2023-06-13 20:42:46.836137"
]
],
"subnet[1].pool[0].declined-addresses": [
[
0,
"2023-06-13 20:42:46.836213"
]
],
"subnet[1].pool[0].reclaimed-declined-addresses": [
[
0,
"2023-06-13 20:42:46.836225"
]
],
"subnet[1].pool[0].reclaimed-leases": [
[
0,
"2023-06-13 20:42:46.836236"
]
],
"subnet[1].pool[0].total-addresses": [
[
11010049,
"2023-06-13 20:42:46.836128"
]
],
```
v6:
```
"subnet[1].pd-pool[0].assigned-pds": [
[
0,
"2023-06-13 21:28:57.196785"
]
],
"subnet[1].pd-pool[0].cumulative-assigned-pds": [
[
0,
"2023-06-13 21:28:57.196744"
]
],
"subnet[1].pd-pool[0].reclaimed-leases": [
[
0,
"2023-06-13 21:28:57.196789"
]
],
"subnet[1].pd-pool[0].total-pds": [
[
256,
"2023-06-13 21:28:57.196741"
]
],
"subnet[1].pool[0].assigned-nas": [
[
0,
"2023-06-13 21:28:57.196773"
]
],
"subnet[1].pool[0].cumulative-assigned-nas": [
[
0,
"2023-06-13 21:28:57.196739"
]
],
"subnet[1].pool[0].declined-addresses": [
[
0,
"2023-06-13 21:28:57.196775"
]
],
"subnet[1].pool[0].reclaimed-declined-addresses": [
[
0,
"2023-06-13 21:28:57.196779"
]
],
"subnet[1].pool[0].reclaimed-leases": [
[
0,
"2023-06-13 21:28:57.196783"
]
],
"subnet[1].pool[0].total-nas": [
[
281474976710656,
"2023-06-13 21:28:57.196736"
]
],
```
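One way to turn these statistic names into labeled samples is to parse the subnet and pool indices out of the name. The Python sketch below is only an illustration: the resulting metric names (`pool_total_addresses`, `pd_pool_total_pds`, and so on) and the label keys are hypothetical, and it assumes the first `[value, timestamp]` pair in each list is the sample to export.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches names such as "subnet[1].pool[0].total-addresses"
# and "subnet[1].pd-pool[0].total-pds".
STAT_RE = re.compile(r"^subnet\[(\d+)\]\.(pool|pd-pool)\[(\d+)\]\.([a-z-]+)$")


def parse_pool_stat(name: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (metric_name, labels) for a pool statistic, or None for other stats."""
    match = STAT_RE.match(name)
    if match is None:
        return None
    subnet_id, pool_kind, pool_id, stat = match.groups()
    # Hypothetical naming scheme: dashes become underscores.
    metric = f"{pool_kind}_{stat}".replace("-", "_")
    labels = {"subnet_id": subnet_id, "pool_id": pool_id}
    return metric, labels


def first_value(samples: List[list]) -> int:
    """Take the value from the first [value, timestamp] pair (an assumption)."""
    return samples[0][0]
```

For example, `parse_pool_stat("subnet[1].pool[0].total-addresses")` yields the metric name `pool_total_addresses` with the labels `subnet_id="1"` and `pool_id="0"`, and statistics that are not per-pool are skipped by returning `None`.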
backlog

NaN metrics values
https://gitlab.isc.org/isc-projects/stork/-/issues/834
2023-01-30T13:13:47Z
Slawek Figiel

Reported by @ray - [Source](https://mattermost.isc.org/isc/pl/n5hqa4gzmigjj87p6c4exbs8zc)
I don't yet have a Prometheus server polling this, but the NaN from the raw /metrics pull here seems wrong:
```
# TYPE bind_traffic_incoming_requests_udp4_size histogram
bind_traffic_incoming_requests_udp4_size_bucket{le="47"} 2
bind_traffic_incoming_requests_udp4_size_bucket{le="+Inf"} 2
bind_traffic_incoming_requests_udp4_size_sum NaN
bind_traffic_incoming_requests_udp4_size_count 2
```
backlog

Update templates for new Kea cumulative assigned statistics
https://gitlab.isc.org/isc-projects/stork/-/issues/254
2022-11-16T11:54:50Z
Francis Dupont

Kea 1.7.7 now has a new statistic that shows an ever-increasing number of assigned addresses. The customer wanted to observe how many devices were provisioned in his network each day (he'll be resetting this every midnight).
backlog

UI: The pool utilization stats should be refreshed
https://gitlab.isc.org/isc-projects/stork/-/issues/191
2022-11-16T11:54:50Z
Tomek Mrugalski

Although the kea->agent->server stats are refreshed every 10 seconds, the UI is not. Right now the user has to hit F5 or Ctrl-R every time to see updated stats. This should be done automatically.
Eventually, this should be a configurable interval, but for the time being setting it to 10s seems reasonable. As for the long-term plans, I like what Grafana does: you have a list of intervals to choose from (1s, 5s, 10s, 30s, 1min, 10min, etc.). During normal operation you'd probably set it to 30s or a minute. However, during emergencies 1s could be useful. (I understand that any value below 10s is not exactly usable right now, as the underlying data pull from the servers runs every 10s, but that too will be configurable one day.)
backlog

Req 2.3 - Kea Degradation Canary
https://gitlab.isc.org/isc-projects/stork/-/issues/46
2023-07-25T13:39:26Z
Vicky Risk (vicky@isc.org)

As an administrator, I need a clear visual indicator when a Kea server/service is becoming overloaded. This alerts me that I need to take some action to prevent further degradation or failure of the service.
As an administrator, if this alarm occurs frequently I would like to be able to customize the level that constitutes an alarming value.
If there is a separate panel of alerts or logged events, I would expect to see these threshold-crossing alarms included there.
It would be ideal if this is available without requiring that I install Grafana or Prometheus, as I may have a small deployment of one or two servers.
possible use cases:
- increasing `secs` reported by clients
- users with external lease db, query to see how long it takes the db to do a select to see if the db itself, or the connection to the db is degraded
- any sort of statistics about the ring buffer, to alert when the buffer is growing excessively (this might be possible with the Stork agent but not with Kea)
- something that could help people detect conflicts when they are running multiple Keas with the same address range, using a shared lease db, because these can also lead to cascading performance issues
Details
* We will need to decide what metric or combination of metrics to base this alarm condition on.
* We discussed the fact that increasing delay in responding to client requests might be an indicator of a service degradation and a leading indicator of Kea server failure.
backlog

Not possible to run DHCPv6 only and collect stats with Prometheus
https://gitlab.isc.org/isc-projects/stork/-/issues/1335
2024-03-25T17:24:07Z
Rinse Kloek

- Stork: which version? 1.15.0
- OS: Debian 11
I have an issue with the Prometheus exporter. I had enabled DHCPv4/DHCPv6/D2 in my kea-ctrl-agent and installed the Kea DHCPv4 server. However, this server only serves DHCPv6 requests. If I remove the DHCPv6/D2 entries from kea-ctrl-agent.conf and uninstall the Kea DHCPv4 server, I can no longer see my DHCPv6 PD stats (kea_dhcp6_pd_assigned_total).
I get this error message in the log on the server
stork-agent[929757]: time="2024-03-19 11:54:23" level="error" msg="Problem parsing DHCPv4 labels from Kea: problem with content of DHCP labels response from Kea: forwarding socket is not configured for the server type dhcp4\nisc.org/stork/agent.(*SubnetList).UnmarshalJSON\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:82\nencoding/json.(*decodeState).array\n\t/builds/isc-projects/stork/tools/golang/go/src/encoding/json/decode.go:507\nencoding/json.(*decodeState).value\n\t/builds/isc-projects/stork/tools/golang/go/src/encoding/json/decode.go:364\nencoding/json.(*decodeState).unmarshal\n\t/builds/isc-projects/stork/tools/golang/go/src/encoding/json/decode.go:181\nencoding/json.Unmarshal\n\t/builds/isc-projects/stork/tools/golang/go/src/encoding/json/decode.go:108\nisc.org/stork/agent.(*lazySubnetNameLookup).fetchAndCacheNames\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:274\nisc.org/stork/agent.(*lazySubnetNameLookup).getName\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:291\nisc.org/stork/agent.(*lazySubnetNameLookup).getNameOrDefault\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:303\nisc.org/stork/agent.(*PromKeaExporter).setDaemonStats\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:772\nisc.org/stork/agent.(*PromKeaExporter).collectStats\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:877\nisc.org/stork/agent.(*PromKeaExporter).statsCollectorLoop\n\t/builds/isc-projects/stork/backend/agent/promkeaexporter.go:724\nruntime.goexit\n\t/builds/isc-projects/stork/tools/golang/go/src/runtime/asm_amd64.s:1650" file=" promkeaexporter.go:276 "
After installing the Kea DHCPv4 server again and re-enabling it in kea-ctrl-agent, these log messages no longer appear and the Prometheus stats collection works again.
Is it possible to remove the DHCPv4 server and keep the stats working for DHCPv6?
Regards,
Rinse

Avoid creating separate metrics per transport
https://gitlab.isc.org/isc-projects/stork/-/issues/833
2022-10-25T13:36:55Z
Slawek Figiel

Reported by @ray - [Source](https://mattermost.isc.org/isc/pl/n5hqa4gzmigjj87p6c4exbs8zc):
> \# TYPE bind_traffic_incoming_requests_udp4_size histogram
> bind_traffic_incoming_requests_udp4_size_bucket{le="47"} 2
> bind_traffic_incoming_requests_udp4_size_bucket{le="+Inf"} 2
> bind_traffic_incoming_requests_udp4_size_sum NaN
> bind_traffic_incoming_requests_udp4_size_count 2
I'd also like to suggest that the udp4 part of these metrics should be a label, e.g. {transport="udp4"}, rather than a separate metric per transport.
The rationale is that, as an operator, I want to be able to graph these things (udp4, tcp6, etc.) both in aggregate and in isolation, and IIUC, labels are the Prometheus way to accomplish that.
outstanding
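
The relabeling suggested in this issue can be sketched in Python as follows. This is an illustration only, not the exporter's code: the helper names are hypothetical, the set of transports (`udp4`, `udp6`, `tcp4`, `tcp6`) is an assumption extrapolated from the two mentioned above, and the `tcp6` metric name in the usage note is made up by analogy with the `udp4` one.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed transport names; only udp4 and tcp6 are mentioned in the report.
NAME_RE = re.compile(
    r"^(?P<prefix>.+)_(?P<transport>udp4|udp6|tcp4|tcp6)_(?P<suffix>.+)$")


def split_transport(metric_name: str) -> Tuple[str, Dict[str, str]]:
    """Move the transport out of the metric name and into a label."""
    match = NAME_RE.match(metric_name)
    if match is None:
        return metric_name, {}
    base = f"{match['prefix']}_{match['suffix']}"
    return base, {"transport": match["transport"]}


def total_by_metric(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Aggregate values across transports once the name no longer encodes them."""
    totals: Dict[str, float] = defaultdict(float)
    for name, value in samples:
        base, _labels = split_transport(name)
        totals[base] += value
    return dict(totals)
```

Here `split_transport("bind_traffic_incoming_requests_udp4_size_count")` gives the base name `bind_traffic_incoming_requests_size_count` with the label `transport="udp4"`. Once the transport is a label, a single metric name covers all transports, so aggregating or filtering becomes a matter of selecting on that label rather than enumerating metric names.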