
AzureKVP Telemetry Library for Hyper-V Integration #133


Merged · 19 commits · Feb 3, 2025

Commits
2b1ded8
Squash merge for adding KVP functionality via tracing and repo instru…
peytonr18 Oct 8, 2024
60af377
Refactoring KVP PR to move kvp.rs into main libazureinit crate, combi…
peytonr18 Nov 12, 2024
f119cc5
Update tracing setup for compatibility with OpenTelemetry 0.27.0
peytonr18 Nov 12, 2024
798ab8a
Merging in latest changes from main.
peytonr18 Nov 13, 2024
a90f155
Merging in changes from main
peytonr18 Nov 26, 2024
44e8737
Merge branch 'main' into probertson-kvp-test
peytonr18 Dec 4, 2024
c3dd388
Resolving lifetime declaration of StringVisitor clippy error, along w…
peytonr18 Dec 4, 2024
b96c33f
Add unit test to validate slice value length and replace hardcoded va…
peytonr18 Dec 6, 2024
352d4cd
Refactor module structure and rename tracing.rs to logging.rs in azur…
peytonr18 Dec 6, 2024
c87ac52
Merge main into probertson-kvp-test
peytonr18 Dec 12, 2024
f62d7b9
Merge branch 'main' into probertson-kvp-test
peytonr18 Dec 12, 2024
71cd9ff
Resolving cargofmt issue
peytonr18 Dec 12, 2024
35bd757
Merge branch 'main' into probertson-kvp-test
peytonr18 Dec 17, 2024
8b2c829
Auditing tracing::info! calls in an attempt to clean up KVP file for …
peytonr18 Dec 17, 2024
e14d8ac
Add filter
peytonr18 Dec 18, 2024
0536a3b
Refactor emit_kvp_kayer and otel_layer with improved per-layer filter…
peytonr18 Dec 20, 2024
6257a81
Updating note to reflect that both the key and the value are null-byt…
peytonr18 Jan 13, 2025
9e6820a
Refactor KVP logging to use module-based targets and update filtering…
peytonr18 Jan 30, 2025
88ae27d
Merge remote-tracking branch 'origin/main' into probertson-kvp-test
peytonr18 Jan 31, 2025
9 changes: 8 additions & 1 deletion Cargo.toml
@@ -14,15 +14,22 @@ build = "build.rs"
exitcode = "1.1.2"
anyhow = "1.0.81"
tokio = { version = "1", features = ["full"] }
tracing-subscriber = { version = "0.3.18", features = ["env-filter"] }
tracing = "0.1.40"
clap = { version = "4.5.21", features = ["derive", "cargo", "env"] }
sysinfo = "0.27"
tracing-subscriber = { version = "0.3.18", features = ["env-filter"] }
opentelemetry = "0.26"
opentelemetry_sdk = "0.26"
tracing-opentelemetry = "0.27"
uuid = { version = "1.2", features = ["v4"] }
chrono = "0.4"

[dev-dependencies]
assert_cmd = "2.0.16"
predicates = "3.1.2"
predicates-core = "1.0.8"
predicates-tree = "1.0.11"
tempfile = "3.3.0"

[dependencies.libazureinit]
path = "libazureinit"
2 changes: 1 addition & 1 deletion README.md
@@ -59,4 +59,4 @@ Any use of third-party trademarks or logos are subject to those third-party's po

## libazureinit

For common library used by this reference implementation, please refer to [libazureinit](https://github.com/Azure/azure-init/tree/main/libazureinit/).
For common library used by this reference implementation, please refer to [libazureinit](https://github.com/Azure/azure-init/tree/main/libazureinit/).
69 changes: 69 additions & 0 deletions doc/libazurekvp.md
@@ -0,0 +1,69 @@
# Tracing Logic Overview

Contributor (review comment):

It is not clear from reading this document how to disable writing to KVP. This is a strong requirement per the Azure Privacy Review: we need to document what is collected and provide the customer with a means to disable it.

This document probably focuses on the technical aspect of tracing/KVP. I would add another section on the main page about telemetry describing what we're collecting; see this section:
https://learn.microsoft.com/en-us/azure/virtual-machines/linux/using-cloud-init#telemetry

Contributor (author reply):

Will address this PR comment in issue #152!

## How Tracing is Set Up

The tracing setup in this project is built around three key layers, each with its own responsibility:

1. **EmitKVPLayer**: Custom Layer for Span Processing
2. **OpenTelemetryLayer**: Context Propagation and Span Export
3. **stderr_layer**: Formatting and Logging to stderr

These layers work together, yet independently, to process span data as it flows through the program.

## Layer Overview

### 1. EmitKVPLayer

- **Purpose**: This custom layer is responsible for processing spans and events by capturing their metadata, generating key-value pairs (KVPs), encoding them into a specific format, and writing the encoded data to the VM's Hyper-V file for consumption by the `hv_kvp_daemon` service.

- **How It Works** (a minimal code sketch follows at the end of this section):
- **Span Processing**: When a span is created, `EmitKVPLayer` processes the span's metadata, generating a unique key for the span and encoding the span data into a binary format that can be consumed by Hyper-V.
- **Event Processing**: When an event is emitted using the `event!` macro, the `on_event` method in `EmitKVPLayer` processes the event, capturing its message and linking it to the current span. Events are useful for tracking specific points in time within a span, such as errors, warnings, retries, or important state changes. Events are recorded independently of spans, but they are tied to the span they occur within through the same span metadata.
- Both span and event data are written to the `/var/lib/hyperv/.kvp_pool_1` file, which is typically monitored by the Hyper-V `hv_kvp_daemon` service.
- The `hv_kvp_daemon` uses this file to exchange key-value pair (KVP) data between the virtual machine and the Hyper-V host. This mechanism is crucial for telemetry and data synchronization.

- **Reference**: For more details on how the Hyper-V Data Exchange Service works, refer to the official documentation here: [Hyper-V Data Exchange Service (KVP)](https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/integration-services#hyper-v-data-exchange-service-kvp).
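
To make the above concrete, here is a minimal, illustrative sketch of a custom `tracing_subscriber` layer in this spirit. It is not the implementation from this PR: the struct name, constructor, key scheme, and value contents are assumptions for illustration; only the fixed record sizes follow the Hyper-V KVP format (512-byte key, 2048-byte value, both null-padded).

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::PathBuf;

use tracing::{Event, Subscriber};
use tracing_subscriber::layer::{Context, Layer};

// Hyper-V KVP records are fixed-size and null-padded.
const KVP_KEY_SIZE: usize = 512;
const KVP_VALUE_SIZE: usize = 2048;

pub struct EmitKvpLayer {
    pool_path: PathBuf,
}

impl EmitKvpLayer {
    pub fn new(pool_path: PathBuf) -> Self {
        Self { pool_path }
    }

    /// Encode a single key/value pair into one fixed-size, null-padded record.
    fn encode_kvp(key: &str, value: &str) -> Vec<u8> {
        let mut record = vec![0u8; KVP_KEY_SIZE + KVP_VALUE_SIZE];
        let key_len = key.len().min(KVP_KEY_SIZE);
        let value_len = value.len().min(KVP_VALUE_SIZE);
        record[..key_len].copy_from_slice(&key.as_bytes()[..key_len]);
        record[KVP_KEY_SIZE..KVP_KEY_SIZE + value_len]
            .copy_from_slice(&value.as_bytes()[..value_len]);
        record
    }
}

impl<S: Subscriber> Layer<S> for EmitKvpLayer {
    fn on_event(&self, event: &Event<'_>, _ctx: Context<'_, S>) {
        // A real layer would visit the event's fields to build the value and
        // derive the key from the enclosing span; this sketch uses metadata only.
        let key = format!("{}/{}", event.metadata().target(), event.metadata().name());
        let record = Self::encode_kvp(&key, "event recorded");
        if let Ok(mut file) = OpenOptions::new()
            .append(true)
            .create(true)
            .open(&self.pool_path)
        {
            let _ = file.write_all(&record);
        }
    }
}
```

A layer like this would then be added to the subscriber stack in `initialize_tracing`, for example with `.with(EmitKvpLayer::new(PathBuf::from("/var/lib/hyperv/.kvp_pool_1")))`.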

### 2. OpenTelemetryLayer

- **Purpose**: This layer integrates with the OpenTelemetry framework to handle context propagation and export span data to an external tracing backend (e.g., Jaeger, Prometheus) or to stdout.
- **How It Works**:
- As spans are created and processed, the `OpenTelemetryLayer` ensures that context is propagated correctly across different parts of the program, which is crucial in distributed systems for tracing requests across service boundaries.
- The span data is then exported to a configured backend or stdout, where it can be visualized and analyzed using OpenTelemetry-compatible tools.
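
As an illustration only (not this PR's exact code), the bridge layer can be constructed roughly as below with the versions pinned in `Cargo.toml` (`opentelemetry_sdk` 0.26, `tracing-opentelemetry` 0.27). The provider here has no exporter attached, so spans would simply be dropped; a real setup would configure an exporter, and the instrumentation name is an assumption.

```rust
use opentelemetry::trace::TracerProvider as _; // brings `.tracer()` into scope
use opentelemetry_sdk::trace::TracerProvider;
use tracing::Subscriber;
use tracing_subscriber::registry::LookupSpan;

/// Illustrative construction of the OpenTelemetry bridge layer.
fn otel_layer<S>() -> impl tracing_subscriber::Layer<S>
where
    S: Subscriber + for<'a> LookupSpan<'a>,
{
    // No exporter configured here; attach one via the builder in real code.
    let provider = TracerProvider::builder().build();
    let tracer = provider.tracer("azure-init");
    tracing_opentelemetry::layer().with_tracer(tracer)
}
```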

### 3. stderr_layer

- **Purpose**: This layer formats and logs span and event data to stderr or a specified log file, providing a human-readable output for immediate inspection.
- **How It Works**:
- Each span's lifecycle events, as well as individual emitted events, are logged in a structured format, making it easy to see the flow of execution in the console or log files.
- This layer is particularly useful for debugging and monitoring during development.
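
A minimal sketch of such a formatting layer is shown below; the actual `stderr_layer` in this change may configure formatting and filtering differently.

```rust
use tracing::Subscriber;
use tracing_subscriber::{fmt, registry::LookupSpan, Layer};

/// Human-readable formatting layer that writes to stderr.
fn stderr_layer<S>() -> impl Layer<S>
where
    S: Subscriber + for<'a> LookupSpan<'a>,
{
    fmt::layer()
        .with_writer(std::io::stderr) // write to stderr instead of stdout
        .with_target(true)            // include the event's module path
}
```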

## How the Layers Work Together

- **Independent Processing**: Each of these layers processes spans and events independently. When a span is created, it triggers the `on_new_span` method in each layer, and when an event is emitted, it triggers the `on_event` method. As the span progresses through its lifecycle (`on_enter`, `on_close`), each layer performs its respective tasks.
- **Order of Execution**: The layers are executed in the order they are added in the `initialize_tracing` function (see the composition sketch below). For instance, `EmitKVPLayer` might process a span before `OpenTelemetryLayer`, but this order only affects the sequence of operations, not the functionality or output of each layer.
- **No Cross-Layer Dependencies**: Each layer operates independently of the others. For example, the `EmitKVPLayer` encodes and logs span and event data without affecting how `OpenTelemetryLayer` exports span data to a backend. This modular design allows each layer to be modified, replaced, or removed without impacting the others.
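
The composition itself is ordinary `tracing_subscriber` layering on a `Registry`. The sketch below shows the general shape of such an `initialize_tracing` function; the filter and the commented-out layers are placeholders, not azure-init's exact code.

```rust
use tracing_subscriber::{fmt, prelude::*, EnvFilter};

/// Illustrative sketch: each `.with(...)` call adds one independent layer.
fn initialize_tracing() {
    tracing_subscriber::registry()
        .with(EnvFilter::from_default_env()) // global level filtering (placeholder)
        // .with(EmitKvpLayer::new(...))     // the custom KVP layer would go here
        // .with(otel_layer())               // and the OpenTelemetry bridge layer
        .with(fmt::layer().with_writer(std::io::stderr)) // the stderr layer
        .init();
}
```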

In the `main.rs` file, the tracing logic is initialized. Spans are instrumented using the `#[instrument]` attribute and events can be created with the `event!` macro to monitor the execution of the function. Here's an example:

```rust
use tracing::{event, instrument, Level};

#[instrument(name = "root")]
async fn provision() -> Result<(), anyhow::Error> {
    event!(Level::INFO, msg = "Starting the provision process...");
    // Other logic...
    Ok(())
}
```

1. **Initialization**:
The `initialize_tracing` function is called at the start of the program to set up the tracing subscriber with the configured layers (`EmitKVPLayer`, `OpenTelemetryLayer`, and `stderr_layer`).

2. **Instrumenting the `provision()` Function**:
The `#[instrument]` attribute is used to automatically create a span for the `provision()` function.
- The `name = "root"` part of the `#[instrument]` attribute specifies the name of the span.
- This span will trace the entire execution of the `provision()` function, capturing any relevant metadata (e.g., function parameters, return values).

3. **Span Processing**:
As the `provision()` function is called and spans are created, entered, exited, and closed, they are processed by the layers configured in `initialize_tracing`:
- **EmitKVPLayer** processes the span, generates key-value pairs, encodes them, and writes them directly to `/var/lib/hyperv/.kvp_pool_1`.
- **OpenTelemetryLayer** handles context propagation and exports span data to a tracing backend or stdout.
- **stderr_layer** logs span information to stderr or another specified output for immediate visibility.
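
Putting these steps together, a minimal sketch of how `main.rs` might wire this up is shown below; the exact signatures in azure-init may differ, and `initialize_tracing` refers to the setup function described above.

```rust
use tracing::{event, instrument, Level};

#[instrument(name = "root")]
async fn provision() -> Result<(), anyhow::Error> {
    event!(Level::INFO, msg = "Starting the provision process...");
    // ... provisioning steps, each of which may open child spans ...
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // initialize_tracing() would be called here first so that EmitKVPLayer,
    // OpenTelemetryLayer, and stderr_layer all receive the spans created below.
    provision().await
}
```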