Telemetry Standards

Consistent telemetry practices ensure effective monitoring, troubleshooting, and analysis of application performance and behavior. This guide outlines the standards for implementing telemetry across all applications.

OpenTelemetry Requirements

All applications must use OpenTelemetry as the standard telemetry framework. OpenTelemetry provides a comprehensive set of tools, APIs, and SDKs for collecting, processing, and exporting telemetry data.

Key Requirements

  • Use OpenTelemetry SDK for all instrumentation needs
  • Follow the W3C Trace Context specification for context propagation (see the propagation sketch after this list)
  • Configure exporters to send telemetry data to your organization's telemetry backend
  • Implement automatic instrumentation where available
  • Add custom instrumentation for business-critical operations
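
To satisfy the context-propagation requirement, the following is a minimal sketch of explicitly pinning the W3C Trace Context and Baggage propagators in OpenTelemetry .NET. These are already the SDK defaults, so treat this as illustrative rather than required code:

using OpenTelemetry;
using OpenTelemetry.Context.Propagation;

// Explicitly select the W3C Trace Context and Baggage propagators.
// This mirrors the OpenTelemetry .NET defaults; call it once at startup.
Sdk.SetDefaultTextMapPropagator(new CompositeTextMapPropagator(new TextMapPropagator[]
{
    new TraceContextPropagator(),
    new BaggagePropagator(),
}));

With the ASP.NET Core and HttpClient instrumentations enabled (see the implementation examples below), the traceparent and tracestate headers are propagated automatically on incoming and outgoing requests.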

Signal Types

Implement all three signal types provided by OpenTelemetry:

  1. Traces: To track request flows through distributed systems
  2. Metrics: To measure system performance and behavior
  3. Logs: To capture detailed information about application events
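
Traces and logs are covered by the implementation examples later in this guide; for custom metrics, the following is a minimal sketch using the .NET Meter API. The meter name MyCompany.OrderProcessing and the orders.processed counter are placeholders, and the meter must also be registered in the metrics pipeline (for example with .AddMeter("MyCompany.OrderProcessing")) for its measurements to be exported.

using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class OrderMetrics
{
    // One Meter per component; the name here is a placeholder
    private static readonly Meter Meter = new("MyCompany.OrderProcessing", "1.0.0");

    // Counter for processed orders
    private static readonly Counter<long> OrdersProcessed =
        Meter.CreateCounter<long>("orders.processed", description: "Number of orders processed");

    public static void RecordProcessedOrder(string status) =>
        // Status values use lowercase invariant, per the value casing standards below
        OrdersProcessed.Add(1, new KeyValuePair<string, object?>("order.status", status));
}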

OpenTelemetry Collector

Deploy the OpenTelemetry Collector as an agent or gateway to:

  • Collect telemetry data from multiple services
  • Process and transform data (filtering, batching, etc.)
  • Export data to multiple destinations simultaneously

Logging Best Practices

Naming Conventions

When implementing logging, adhere to these naming conventions:

  • Use PascalCase for named placeholders in log messages
  • Apply consistent terminology across all log entries
  • Keep property names descriptive and concise

Value Casing Standards

For telemetry values:

  • Use lowercase invariant for all property values unless casing is required for uniqueness
  • Maintain original casing only when it differentiates identity (e.g., case-sensitive identifiers)
  • Normalize all status codes, error types, and general string values to lowercase
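
As a minimal sketch (the TelemetryValues helper name is illustrative, not a shared library), a normalization step like the following can be applied before values are attached to logs, metrics, or spans:

public static class TelemetryValues
{
    // Lowercase invariant for general values; original casing only for case-sensitive identifiers
    public static string Normalize(string value, bool caseSensitiveIdentifier = false) =>
        caseSensitiveIdentifier ? value : value.ToLowerInvariant();
}

For example, TelemetryValues.Normalize("Completed") yields "completed", while Normalize(orderId, caseSensitiveIdentifier: true) leaves a case-sensitive identifier untouched.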

Standard Identifiers

Always include the following standard identifiers in your telemetry for tracking related events across distributed systems:

  Identifier      | Property Name          | Description
  Correlation ID  | coaxle.correlation_id  | Tracks a request across multiple services
  Party ID        | coaxle.party_id        | Identifies the user or entity associated with the event
  Tenant ID       | coaxle.tenant_id       | Identifies the tenant context for multi-tenant applications

Implementation Examples

Named Placeholders

Always use named placeholders instead of positional placeholders. Named placeholders improve readability and maintainability of logging code.

// Correct: Using PascalCase named placeholders
logger.LogInformation("User {UserId} accessed resource {ResourceId} with permission level {PermissionLevel}",
    userId, resourceId, permissionLevel);

// Incorrect: Using positional placeholders
logger.LogInformation("User {0} accessed resource {1} with permission level {2}",
    userId, resourceId, permissionLevel);

// Incorrect: Using non-PascalCase named placeholders
logger.LogInformation("User {userId} accessed resource {resourceId} with permission level {permissionLevel}",
    userId, resourceId, permissionLevel);

Using Standard Identifiers

Always include standard identifiers in your telemetry data:

using System.Collections.Generic;
using Microsoft.Extensions.Logging;

public class OrderService
{
    private readonly ILogger<OrderService> _logger;

    public OrderService(ILogger<OrderService> logger)
    {
        _logger = logger;
    }

    public void ProcessOrder(Order order, string correlationId, string partyId, string tenantId)
    {
        // Add identifiers to the logging scope
        using (_logger.BeginScope(new Dictionary<string, object>
        {
            ["coaxle.correlation_id"] = correlationId,
            ["coaxle.party_id"] = partyId,
            ["coaxle.tenant_id"] = tenantId
        }))
        {
            _logger.LogInformation("Processing order {OrderId} for customer {CustomerId}",
                order.Id, order.CustomerId);

            // Processing logic here

            // Using lowercase for status values
            var status = "completed";
            _logger.LogInformation("Order {OrderId} processed successfully with {ItemCount} items and status {Status}",
                order.Id, order.Items.Count, status);
        }
    }
}

OpenTelemetry Implementation

// Initialize OpenTelemetry in your application
using System.Collections.Generic;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using OpenTelemetry.Metrics;
using OpenTelemetry.Logs;

// In Startup.cs or Program.cs
public void ConfigureServices(IServiceCollection services)
{
    // Configure OpenTelemetry with the required resource attributes
    services.AddOpenTelemetry()
        .ConfigureResource(builder => builder
            .AddService("my-service-name", serviceVersion: "1.0.0")
            .AddAttributes(new Dictionary<string, object>
            {
                ["deployment.environment"] = "production" // Lowercase invariant value
            }))
        .WithTracing(builder => builder
            // Add instrumentation for ASP.NET Core, HttpClient, etc.
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation()
            // Add custom span processor for correlation IDs
            .AddProcessor(new CorrelationIdProcessor("coaxle.correlation_id", "coaxle.party_id", "coaxle.tenant_id"))
            .AddOtlpExporter()) // Send to the OpenTelemetry Collector
        .WithMetrics(builder => builder
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddRuntimeInstrumentation()
            .AddOtlpExporter());

    // Configure OpenTelemetry logging
    services.AddLogging(builder =>
    {
        builder.AddOpenTelemetry(options =>
        {
            options.AddOtlpExporter();
            options.IncludeFormattedMessage = true;
            options.IncludeScopes = true;
        });
    });
}

// Example of creating a custom span
public async Task ProcessOrderWithTracing(Order order, string correlationId, string partyId, string tenantId)
{
    var activitySource = new ActivitySource("MyCompany.OrderProcessing");

    // Start a new span
    using var activity = activitySource.StartActivity("ProcessOrder");

    // Add standard identifiers as attributes
    activity?.SetTag("coaxle.correlation_id", correlationId);
    activity?.SetTag("coaxle.party_id", partyId);
    activity?.SetTag("coaxle.tenant_id", tenantId);

    // Add business context - unique identifiers keep original casing if needed
    activity?.SetTag("order.id", order.Id);
    activity?.SetTag("order.customer_id", order.CustomerId);

    // Add a status attribute with lowercase invariant value
    activity?.SetTag("order.status", "processing");

    try
    {
        // Process the order
        await ProcessOrderInternal(order);

        // Record success
        activity?.SetStatus(ActivityStatusCode.Ok);
        activity?.SetTag("order.status", "completed"); // Lowercase invariant value
    }
    catch (Exception ex)
    {
        // Record error
        activity?.SetStatus(ActivityStatusCode.Error);
        activity?.RecordException(ex);
        activity?.SetTag("order.status", "failed"); // Lowercase invariant value
        throw;
    }
}

Configuring OpenTelemetry Collector

Deploy the OpenTelemetry Collector using the following configuration template:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    send_batch_size: 1000
    timeout: 10s
  attributes:
    actions:
      - key: deployment.environment
        action: upsert
        value: ${DEPLOYMENT_ENVIRONMENT}
  resource:
    attributes:
      - key: service.namespace
        value: "coaxle"
        action: upsert

exporters:
  otlp:
    endpoint: "your-telemetry-backend:4317"
    tls:
      insecure: false
      ca_file: /etc/ssl/certs/ca-certificates.crt
  logging:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, attributes, resource]
      exporters: [otlp, logging]
    metrics:
      receivers: [otlp]
      processors: [batch, attributes, resource]
      exporters: [otlp, logging]
    logs:
      receivers: [otlp]
      processors: [batch, attributes, resource]
      exporters: [otlp, logging]

Log Levels

Use appropriate log levels to categorize the severity and importance of log messages:

  Level        | When to Use
  Trace        | Extremely detailed information, useful only for pinpointing issues in a specific component
  Debug        | Detailed information useful for debugging purposes
  Information  | General information highlighting the progress of the application
  Warning      | Indicates potential issues or unexpected behavior that doesn't cause application failure
  Error        | Error conditions that affect a specific operation but not the overall application
  Critical     | Critical errors that require immediate attention and may cause application failure
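
The following is an illustrative sketch of a representative message at each level; PaymentService, its parameters, and the message wording are placeholders, not a prescribed API:

using System;
using Microsoft.Extensions.Logging;

public class PaymentService
{
    private readonly ILogger<PaymentService> _logger;

    public PaymentService(ILogger<PaymentService> logger) => _logger = logger;

    public void LogExamples(string paymentId, int attempt, double latencyMs, Exception ex)
    {
        _logger.LogTrace("Entering retry loop for payment {PaymentId}, attempt {Attempt}", paymentId, attempt);
        _logger.LogDebug("Resolved provider settings for payment {PaymentId}", paymentId);
        _logger.LogInformation("Payment {PaymentId} authorized", paymentId);
        _logger.LogWarning("Provider latency {LatencyMs} ms exceeded the expected threshold", latencyMs);
        _logger.LogError(ex, "Payment {PaymentId} failed to authorize", paymentId);
        _logger.LogCritical(ex, "Payment provider unreachable; no payments can be processed");
    }
}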

Performance Considerations

  • Avoid excessive logging in high-throughput paths
  • Use asynchronous logging where appropriate
  • Consider sampling for high-volume telemetry (see the sampler sketch after this list)
  • Implement log rotation and retention policies
  • Monitor the performance impact of telemetry implementations
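
For the sampling point above, a minimal sketch of head sampling in OpenTelemetry .NET, placed inside ConfigureServices as in the implementation example earlier (the 10% ratio is an example value to agree with your team, not a mandated default):

services.AddOpenTelemetry()
    .WithTracing(builder => builder
        // Respect the parent's sampling decision; otherwise keep ~10% of traces
        .SetSampler(new ParentBasedSampler(new TraceIdRatioBasedSampler(0.10)))
        .AddOtlpExporter());

Sampling can also be applied centrally in the OpenTelemetry Collector instead of in each service; coordinate with your telemetry backend owners on which approach to use.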

Sensitive Information

Never log sensitive information such as:

  • Passwords or authentication tokens
  • Personal identifiable information (PII)
  • Financial data
  • Health information
  • Any data protected by privacy regulations

Instead, log references or masked versions of this information when necessary.
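
A minimal masking sketch (the Mask helper is illustrative, not a shared library): keep only enough of the value to correlate events, never the full secret.

public static class Mask
{
    // Keeps the last four characters and masks the rest, e.g. "************1234"
    public static string Last4(string value) =>
        value.Length <= 4 ? new string('*', value.Length) : new string('*', value.Length - 4) + value[^4..];
}

Then log the masked reference, for example _logger.LogInformation("Card {CardNumber} accepted", Mask.Last4(cardNumber)), instead of the raw value.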

OpenTelemetry Semantic Conventions

Adhere to OpenTelemetry semantic conventions for consistent naming across services:

  • Use the OpenTelemetry Semantic Conventions for attribute naming
  • Follow standard naming for error attributes
  • Use standard semantic attributes for HTTP, database, and messaging operations

  Domain           | Attribute Prefix | Examples
  HTTP             | http.            | http.method, http.status_code
  Database         | db.              | db.system, db.statement
  Messaging        | messaging.       | messaging.system, messaging.destination
  Exception        | exception.       | exception.type, exception.message
  Custom Business  | coaxle.          | coaxle.correlation_id, coaxle.tenant_id
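
As an illustrative sketch of combining the semantic-convention attribute names above with the coaxle identifiers (the ActivitySource name, span name, and SQL text are placeholders):

using System.Diagnostics;

public static class OrderQueries
{
    private static readonly ActivitySource Source = new("MyCompany.OrderProcessing");

    public static void TraceOrderQuery(string tenantId)
    {
        // Database client span using semantic-convention attribute names from the table above
        using var activity = Source.StartActivity("QueryOrders", ActivityKind.Client);
        activity?.SetTag("db.system", "postgresql");
        activity?.SetTag("db.statement", "SELECT * FROM orders WHERE tenant_id = @tenantId");
        activity?.SetTag("coaxle.tenant_id", tenantId);
    }
}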

Validation and Testing

  • Verify telemetry data quality with automated tests (a minimal test sketch follows this list)
  • Implement telemetry validation as part of CI/CD pipelines
  • Set up monitoring alerts for missing or malformed telemetry data
  • Regularly review and optimize telemetry implementations
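
As a minimal sketch of an automated check (the xUnit test and the call to the illustrative OrderQueries example above are placeholders), an ActivityListener can assert that spans carry the required standard identifiers:

using System.Collections.Generic;
using System.Diagnostics;
using Xunit;

public class TelemetryValidationTests
{
    [Fact]
    public void OrderQuerySpan_Includes_Tenant_Identifier()
    {
        var captured = new List<Activity>();
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == "MyCompany.OrderProcessing",
            Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllDataAndRecorded,
            ActivityStopped = captured.Add
        };
        ActivitySource.AddActivityListener(listener);

        // Exercise the instrumented code path
        OrderQueries.TraceOrderQuery("tenant-123");

        var span = Assert.Single(captured);
        Assert.NotNull(span.GetTagItem("coaxle.tenant_id"));
    }
}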

Front-End Observability with Grafana Faro

This guide will help you integrate Grafana Faro into your web applications for comprehensive front-end observability, including real user monitoring (RUM), error tracking, performance monitoring, and distributed tracing.

Overview

Grafana Faro is a web SDK that instruments frontend JavaScript applications to collect telemetry data and forward it to Grafana Cloud, Grafana Alloy, or a custom receiver. It provides:

  • Real User Monitoring (RUM): Track actual user interactions and performance
  • Error Tracking: Capture unhandled errors and exceptions
  • Performance Monitoring: Collect Web Vitals and performance metrics
  • Distributed Tracing: End-to-end visibility across frontend and backend
  • Custom Events: Track business-specific metrics and user actions

Prerequisites

Before integrating Grafana Faro, you'll need:

  1. Access to Grafana Cloud: Contact your team lead or DevOps team for access to the Grafana Cloud instance
  2. Frontend Application Configuration: Created in the Grafana Cloud account
  3. Collector URL: Obtained from the Grafana Cloud instance

Step 1: Set Up Grafana Cloud Frontend Application

1.1 Access Grafana Cloud

  1. Contact your team lead or DevOps team to get access to the Grafana Cloud instance
  2. Once you have access, log into the Grafana Cloud portal
  3. Navigate to the Grafana Cloud instance

1.2 Create a Frontend Application

Team Coordination

Before creating a new frontend application, check with your team to see if one already exists for your project. Multiple applications may share the same Grafana Cloud frontend application if they're part of the same system.

  1. In the Grafana Cloud instance, navigate to Frontend Observability
  2. Click "Create New"
  3. Configure your application using the company's naming conventions:
    • Application Name: A name that clearly identifies the application
    • Domains: The domains the application is accessible from; these are allow-listed so telemetry can be sent to Grafana
  4. Click "Next"

1.3 Get Your Configuration Details

After creating the application (or getting access to an existing one), you'll need:

  • Faro URL: Your application collector endpoint (e.g., https://faro-collector-company123.grafana.net/collect/{KEY})
  • Application Name: The name configured in the company's Grafana Cloud

Step 2: Choose Your Integration Method

Grafana Faro can be integrated using two main approaches:

Option A: NPM Package

Best for modern JavaScript frameworks and build tools.

Option B: CDN Script

Best for simple HTML pages or when you can't modify the build process.


Option A: NPM Integration

A.1 Install the Package

Choose the appropriate package based on your framework:

Vanilla JavaScript/TypeScript

# Using npm
npm install @grafana/faro-web-sdk
npm install @grafana/faro-web-tracing

# Using yarn
yarn add @grafana/faro-web-sdk
yarn add @grafana/faro-web-tracing

A.2 Basic JavaScript/TypeScript Setup


import { getWebInstrumentations, initializeFaro } from '@grafana/faro-web-sdk';
import { TracingInstrumentation } from '@grafana/faro-web-tracing';

initializeFaro({
  url: '{FARO_URL}',
  app: {
    name: '{APP_NAME}',
    version: '1.0.0',
    environment: 'production'
  },

  instrumentations: [
    // Mandatory, omits default instrumentations otherwise.
    ...getWebInstrumentations(),

    // Tracing package to get end-to-end visibility for HTTP requests.
    new TracingInstrumentation(),
  ],
});

// Faro automatically captures:
// - Unhandled errors and promise rejections
// - Console messages (warn, info, error)
// - Web Vitals (LCP, FID, CLS, etc.)
// - Page views and navigation events

Option B: CDN Integration

B.1 Basic CDN Setup

Add this to your HTML <head> section:

<script>
  (function () {
    // Create a script tag for loading the library
    var script = document.createElement('script');
    script.onload = () => {
      window.GrafanaFaroWebSdk.initializeFaro({
        url: '{FARO_URL}',

        app: {
          name: '{APP_NAME}',
          version: '1.0.0', // Optional, but recommended
        },
      });
    };

    // Set the source of the script tag to the CDN
    script.src = 'https://unpkg.com/@grafana/faro-web-sdk@^1.0.0-beta/dist/bundle/faro-web-sdk.iife.js';

    document.head.appendChild(script);
  })();
</script>

B.2 CDN with Tracing Support

See the official Grafana documentation for more details.


Step 3: Verification and Testing

3.1 Verify Data Collection

  1. Check Browser Console: Look for Faro initialization messages
  2. Network Tab: Verify requests to the collector URL
  3. Grafana Cloud: Check the Frontend Observability dashboard for incoming data
  4. Team Coordination: Confirm with your team that data is appearing in the shared dashboards

3.2 Test Error Tracking

// Trigger a test error
throw new Error('Test error for Faro integration');

// Or send a custom error
faro.api.pushError(new Error('Custom test error'));

3.3 Test Custom Events

// Send a test event
faro.api.pushEvent('integration_test', {
  timestamp: Date.now(),
  source: 'manual_test'
});

Best Practices

  1. Team Coordination: Check with your team before creating new applications - reuse existing ones when appropriate
  2. Company Standards: Follow company naming conventions for applications and environments
  3. Environment Separation: Use different Grafana Cloud instances for test/UAT/production
  4. Sensitive Data: Avoid logging sensitive information (follow company data handling policies)
  5. Performance: Monitor the impact on application performance
  6. Sampling: Implement sampling for high-traffic applications (coordinate with team on data usage limits)
  7. Error Boundaries: Use React Error Boundaries for better error handling
  8. User Privacy: Respect user privacy and comply with company privacy policies and regulations
  9. Secret Management: Use the approved secret management system for API keys

Next Steps

  • Explore the Grafana Cloud Dashboards for visualizing your data
  • Coordinate with your team on Shared Alerts for critical errors or performance thresholds
  • Work with the team to implement Custom Dashboards for business-specific metrics
  • Configure Backend Tracing for end-to-end observability (coordinate with backend team)