Debug School

Joshica S

Datadog Training Day 4 - Assignment (14-09-2023)

1 Top 10 metrics/indicators for APM
i. APM trace metrics - Trace application metrics are collected after you enable trace collection and instrument your application.
ii. Request count - trace.<SPAN_NAME>.hits - Represents the count of hits for a given span.
trace.<SPAN_NAME>.hits.by_http_status - Represents the count of hits for a given span, broken down by HTTP status code.
iii. Latency distribution - trace.<SPAN_NAME> - Represents the latency distribution for all services, resources, and versions across different environments and second primary tags.
iv. Errors - trace.<SPAN_NAME>.errors - Represents the count of errors for a given span.
trace.<SPAN_NAME>.errors.by_http_status - Represents the count of errors for a given span, broken down by HTTP status code.
v. Span count - trace.<SPAN_NAME>.span_count - Represents the number of spans collected in a given interval.
trace.<SPAN_NAME>.span_count.by_http_status - Represents the number of spans collected in a given interval, broken down by HTTP status code.
vi. Duration - trace.<SPAN_NAME>.duration - Measures the total time for a collection of spans within a time interval.
vii. Duration by - trace.<SPAN_NAME>.duration.by_http_status - Measures the total time for a collection of spans for each HTTP status.
trace.<SPAN_NAME>.duration.by_service - Measures the total time spent actually processing for each service.
trace.<SPAN_NAME>.duration.by_type - Measures the total time spent actually processing for each service type.
trace.<SPAN_NAME>.duration.by_type.by_http_status - Measures the total time spent actually processing for each service type and HTTP status.
trace.<SPAN_NAME>.duration.by_service.by_http_status - Measures the total time spent actually processing for each service and HTTP status.
viii. Apdex - trace.<SPAN_NAME>.apdex - Measures the Apdex score for each web service.
trace.<SPAN_NAME>.apdex.by.resource_<2ND_PRIM_TAG>_service - Represents the Apdex score for all combinations of resources, second primary tags, and services.
trace.<SPAN_NAME>.apdex.by.resource_service - Measures the Apdex score for each combination of resources and web services.
trace.<SPAN_NAME>.apdex.by.<2ND_PRIM_TAG>_service - Measures the Apdex score for each combination of second primary tag and web services.
trace.<SPAN_NAME>.apdex.by.service - Measures the Apdex score for each web service.
ix. Runtime metrics - Runtime metrics collection can be enabled by setting the DD_RUNTIME_METRICS_ENABLED=true environment variable when running with ddtrace-run.
x. Generate custom metrics from spans - Span-based custom metrics let you monitor service metrics such as request counts, errors, and latency percentiles, and analyse database queries or endpoints.
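The Apdex score listed under item viii can be illustrated with the standard Apdex formula: requests at or below a threshold T count as satisfied, requests between T and 4T count as tolerating (half weight), and the rest count as frustrated. The sketch below is only an illustration of that arithmetic. The function name `apdex`, the 0.5 s threshold, and the sample latencies are assumptions made for this example, and it looks only at latency, not at how a given tool treats errored requests.

```python
def apdex(latencies_s, threshold_s=0.5):
    """Return the Apdex score for a list of request latencies in seconds.

    Standard Apdex formula: (satisfied + tolerating / 2) / total, where
    satisfied means latency <= T and tolerating means T < latency <= 4T.
    """
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    satisfied = sum(1 for t in latencies_s if t <= threshold_s)
    tolerating = sum(
        1 for t in latencies_s if threshold_s < t <= 4 * threshold_s
    )
    return (satisfied + tolerating / 2) / len(latencies_s)


# Hypothetical sample latencies (seconds), not taken from a real service.
samples = [0.1, 0.3, 0.4, 0.9, 1.5, 2.5]
score = apdex(samples, threshold_s=0.5)
print(f"Apdex: {score:.2f}")  # 3 satisfied, 2 tolerating, 1 frustrated -> 0.67
```

A score of 1.0 means every request was satisfied, and a score near 0 means most requests were frustrated.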

2 Top 10 metrics/indicators for Synthetic monitoring
The following metrics are generated by Synthetic Monitoring tests and Continuous Testing settings.
a. General metrics - synthetics.test_runs - The number of Synthetic test runs.
b. API tests - synthetics.api.response - The count of API responses received.
c. HTTP tests - synthetics.http.* metrics come from your API HTTP tests.
d. SSL tests - synthetics.ssl.* metrics come from your API SSL tests.
e. DNS tests - synthetics.dns.* metrics come from your API DNS tests.
f. WebSocket tests - synthetics.websocket.* metrics come from your API WebSocket tests.
g. TCP tests - synthetics.tcp.* metrics come from your API TCP tests.
h. UDP tests - synthetics.udp.* metrics come from your API UDP tests.
i. ICMP tests - synthetics.icmp.* metrics come from your API ICMP tests.
j. Multistep API tests - synthetics.multi.* metrics come from your multistep API tests.
k. Browser tests - synthetics.browser.* metrics come from your browser tests.
l. Private locations - synthetics.pl.* metrics come from your private locations.

Metrics starting with:
synthetics.test_runs come from all your Synthetic tests
datadog.estimated_usage.synthetics.* return relevant usage data from your Synthetic tests
synthetics.on_demand return relevant usage data for Continuous Testing
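The prefix-to-source mapping above can be written down as a small lookup, which is handy when sorting a list of metric names by the kind of test that produced them. This is a minimal sketch, not part of any Datadog library: the `SYNTHETICS_PREFIXES` table and the `classify` helper are names invented for this example, and `synthetics.browser.example` is a hypothetical metric name used only to exercise the lookup.

```python
# Prefixes taken from the list above, mapped to the source of the metric.
SYNTHETICS_PREFIXES = {
    "synthetics.test_runs": "all Synthetic tests",
    "synthetics.api.": "API tests",
    "synthetics.http.": "API HTTP tests",
    "synthetics.ssl.": "API SSL tests",
    "synthetics.dns.": "API DNS tests",
    "synthetics.websocket.": "API WebSocket tests",
    "synthetics.tcp.": "API TCP tests",
    "synthetics.udp.": "API UDP tests",
    "synthetics.icmp.": "API ICMP tests",
    "synthetics.multi.": "multistep API tests",
    "synthetics.browser.": "browser tests",
    "synthetics.pl.": "private locations",
    "datadog.estimated_usage.synthetics.": "Synthetic tests usage data",
    "synthetics.on_demand": "Continuous Testing usage data",
}


def classify(metric_name):
    """Return the source for a metric name, or None if no prefix matches."""
    matches = [p for p in SYNTHETICS_PREFIXES if metric_name.startswith(p)]
    if not matches:
        return None
    # Prefer the longest matching prefix in case prefixes ever overlap.
    return SYNTHETICS_PREFIXES[max(matches, key=len)]


print(classify("synthetics.api.response"))     # API tests
print(classify("synthetics.browser.example"))  # browser tests (hypothetical name)
print(classify("system.cpu.user"))             # None
```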

3 Top 10 metrics/indicators for RUM
Datadog Real User Monitoring (RUM) provides end-to-end visibility into the user experience and performance of your browser and mobile applications. RUM allows you to capture and retain complete user sessions for 30 days. This means you can pinpoint bugs, prioritize issues, and determine fixes with data collected across that 30-day window.

  1. User satisfaction and Apdex
  2. First Paint (FP) and First Contentful Paint (FCP)
  3. Time to Interactive (TTI)
  4. Page speed and load time
  5. Time to First Byte (TTFB)
  6. DNS lookup time
  7. Error rate
  8. Peak response time
  9. Hardware utilization
  10. Uptime
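Several of the indicators above (DNS lookup time, time to first byte, page load time) are differences between timestamps that browsers expose through the Navigation Timing API. The sketch below shows that arithmetic under stated assumptions: the timing values are hypothetical samples in milliseconds, the dictionary keys follow the browser's `PerformanceNavigationTiming` field names, TTFB is measured from navigation start here (some tools measure it from the start of the request instead), and the helper `timing_indicators` is a name invented for this example rather than anything RUM provides.

```python
def timing_indicators(t):
    """Derive a few page-performance indicators from navigation timestamps.

    `t` maps Navigation Timing field names to millisecond timestamps
    relative to the start of the navigation.
    """
    return {
        # Time spent resolving the hostname.
        "dns_lookup_ms": t["domainLookupEnd"] - t["domainLookupStart"],
        # Time from navigation start until the first response byte arrives.
        "ttfb_ms": t["responseStart"] - t["startTime"],
        # Time from navigation start until the load event finishes.
        "page_load_ms": t["loadEventEnd"] - t["startTime"],
    }


# Hypothetical sample values, not measured from a real page.
sample = {
    "startTime": 0,
    "domainLookupStart": 5,
    "domainLookupEnd": 30,
    "responseStart": 180,
    "loadEventEnd": 1450,
}

print(timing_indicators(sample))
# {'dns_lookup_ms': 25, 'ttfb_ms': 180, 'page_load_ms': 1450}
```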
