Lab - Traffic Mirroring

1. Clean up the existing BookService deployment

  1. Using the PowerShell session already used for kubectl, remove the existing bookservice deployment and service by executing the following commands:

    kubectl delete deployment bookservice ; kubectl delete service bookservice
    

    which returns:

    deployment.extensions "bookservice" deleted
    service "bookservice" deleted
    
  2. Double check the results of the delete operation by executing:

    kubectl get pod; kubectl get service
    

    The output should confirm that no bookservice pods or services remain:

    NAME                            READY   STATUS    RESTARTS   AGE
    bookinfo-spa-57bdd84f98-92r2q   2/2     Running   0          39m
    NAME           TYPE           CLUSTER-IP   EXTERNAL-IP      PORT(S)        AGE
    bookinfo-spa   LoadBalancer   10.0.24.21   137.117.168.206  80:31470/TCP   2d5h
    kubernetes     ClusterIP      10.0.0.1     <none>           443/TCP        3d2h
    

2. Create the mirrored service and deployment

  1. Execute the following commands:

    kubectl apply -f C:\Labs\k8sconfigurations\mirroring\bookservice-V2-mirroring.yaml ; kubectl get services; kubectl get deployments

    and you should get the following output:

    image.png

  2. At this point, both the mirrored deployment (bookservicemirror) and the user-facing deployment (bookservice) are configured with the same Docker image (readymirroring/bookservice). You can now browse the web application or invoke the poller.ps1 script used in the previous modules.

    image.png

  3. Wait a couple of minutes for Azure Application Insights to collect telemetry, then paste the content of the "C:\Labs\Lab_Modules\k8sconfigurations\mirroring\LogAnalyticsQuery.md" file into Azure Log Analytics.

    requests
    | where customDimensions["VersionTag"] contains "MIR-"
    | summarize duration = avg(duration), requestCount = count() by name, podVersion = tostring(customDimensions["VersionTag"]), resultCode
    | sort by name, podVersion
    

    Then click "Run"; you should get something similar to the following image:

    image.png

    Expect small differences between the numbers in your query results and those in the image above.

  4. We are done with our first traffic mirroring! The query results above show that, as expected, result codes and durations are very close between the mirrored and user-facing services.

    To make this comparison possible, we tagged traffic handled by the user-facing service with the "VersionTag" value "V1MIR-LiveBookService" and traffic handled by the mirrored service with "V2MIR-BookService". The query reports this value in the "podVersion" column.

How does it work?

The front-end reverse proxy, Envoy, has a very useful option that lets it send traffic to both a live cluster and a mirror cluster. Traffic is sent to the mirror cluster in "fire and forget" mode, which means that Envoy doesn't wait for a response from the mirror cluster.
You can find the mirror configuration in the file "C:\Labs\src\Sidecars\default-sidecar.yaml". Below is an excerpt of the file:

image.png

On lines 25 and 26, we configure Envoy so that every request to the "bookservice" cluster is also mirrored to the "bookservicemirror" cluster.
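As a rough illustration of such a mirroring rule, here is a minimal sketch using the Envoy v3 route syntax. It is not the lab's actual file: only the two cluster names come from this lab, while the route and virtual host names, `domains`, and `prefix` values are assumptions. Older Envoy releases expressed the same rule with a singular `request_mirror_policy` field, so the lab file may look slightly different.

```yaml
# Hypothetical excerpt: everything except the two cluster names is an assumption.
route_config:
  name: local_route                    # assumed name
  virtual_hosts:
  - name: bookservice_host             # assumed name
    domains: ["*"]                     # assumed: match any host
    routes:
    - match:
        prefix: "/"                    # assumed: apply to every path
      route:
        cluster: bookservice           # live cluster; its response is returned to the caller
        request_mirror_policies:
        - cluster: bookservicemirror   # mirror cluster; its responses are discarded
```

Envoy also appends `-shadow` to the host/authority header of mirrored requests, which can help to tell them apart on the receiving side.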
The configuration of the two clusters is then straightforward (note how the addresses correspond to the Kubernetes service names):

image.png
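The cluster definitions could look roughly like the sketch below, again in Envoy v3 syntax. The cluster names and the addresses (the Kubernetes service names) follow the text above, while the port, timeout, discovery type, and load-balancing policy are assumptions and may differ from the lab's file.

```yaml
# Hypothetical excerpt: port, timeout, type, and lb_policy are assumptions.
clusters:
- name: bookservice
  connect_timeout: 0.25s               # assumed value
  type: STRICT_DNS                     # resolve the Kubernetes service name through DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: bookservice
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: bookservice         # Kubernetes service name
              port_value: 80               # assumed port
- name: bookservicemirror
  connect_timeout: 0.25s               # assumed value
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: bookservicemirror
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: bookservicemirror   # Kubernetes service name
              port_value: 80               # assumed port
```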

3. Introduce a performance decrease in the mirrored service

  1. From the same folder, we're going to roll out a new version of our mirrored bookservice that introduces a delay while loading book reviews. Execute the following command:

    kubectl apply -f C:\Labs\k8sconfigurations\mirroring\bookservice-V3-delays.yaml

  2. At this point, browse through the book reviews again from the web page, or run poller.ps1 as shown below:

    image.png

  3. Wait a couple of minutes for Azure Application Insights to collect telemetry, then paste the content of the "C:\Labs\Lab_Modules\k8sconfigurations\mirroring\LogAnalyticsQuery.md" file into Azure Log Analytics.

    requests
    | where customDimensions["VersionTag"] contains "MIR-"
    | summarize duration = avg(duration), requestCount = count() by name, podVersion = tostring(customDimensions["VersionTag"]), resultCode
    | sort by name, podVersion
    

    Then click "Run"; you should get something similar to the following image:

    image.png

    Expect small differences between the numbers in your query results and those in the image above.

We have caught our first problem before it could impact real users! The Azure Log Analytics query results show that the service tagged "V3MIR-BookServiceDelay" has an average duration of 1,208 milliseconds, while "V1MIR-LiveBookService" (the version real users see) still has an average request duration of just 15 milliseconds, so real users are not impacted.

4. Introduce a fault in the mirrored service

  1. We're going to roll out a new version of our mirrored service that introduces a fault while loading book reviews with BookId = 2 and BookId = 4 (the same fault we used in previous modules). Execute the following command:

    kubectl apply -f C:\Labs\k8sconfigurations\mirroring\bookservice-V4-fault.yaml

  2. At this point, browse through the book reviews again from the web page, or run poller.ps1 as shown below:

    image.png

  3. Wait a couple of minutes for Azure Application Insights to collect telemetry, then paste the content of the "C:\Labs\k8sconfigurations\mirroring\LogAnalyticsQuery.md" file into Azure Log Analytics.

    requests
    | where customDimensions["VersionTag"] contains "MIR-"
    | summarize duration = avg(duration), requestCount = count() by name, podVersion = tostring(customDimensions["VersionTag"]), resultCode
    | sort by name, podVersion
    

    Then click "Run"; you should get something similar to the following image:

    image.png

    Expect small differences between the numbers in your query results and those in the image above.

We have prevented real users from experiencing a failure! The Log Analytics query results show that the service tagged "V4MIR-BookServiceFault" has several requests with a 500 status code, indicating a failure. Meanwhile, the user-facing version of the service, "V1MIR-LiveBookService", runs smoothly without any failures.