|
| 1 | +--- |
| 2 | +page_type: sample |
| 3 | +languages: |
| 4 | +- bash |
| 5 | +- python |
| 6 | +products: |
| 7 | +- azure-machine-learning |
| 8 | +description: Sample setup script to scan Compute Instances for malware and security vulnerabilities |
| 9 | +--- |
| 10 | + |
| 11 | +# Compute Instance Security Scanner |
| 12 | + |
| 13 | +[](../../../LICENSE) |
| 14 | + |
| 15 | +A security scanner for Azure ML [Compute Instances](https://learn.microsoft.com/en-us/azure/machine-learning/concept-compute-instance) reporting malware and vulnerabilities in OS and Python packages to [Azure Log Analytics](https://learn.microsoft.com/en-us/azure/azure-monitor/logs/log-analytics-overview). For details on the vulnerability management process for the Azure Machine Learning service, see [Vulnerability Management](https://learn.microsoft.com/azure/machine-learning/concept-vulnerability-management). |
| 16 | + |
| 17 | +## Getting Started |
| 18 | + |
| 19 | +> Prerequisite: an Azure ML workspace with a Compute Instance running and [diagnostic logs](https://learn.microsoft.com/en-us/azure/machine-learning/monitor-azure-machine-learning) streaming to Log Analytics. See further down for alternative setups. |
| 20 | +
|
| 21 | +1. **Upload the scanner to Azure ML**: |
| 22 | + 1. Download [`amlsecscan.py`](amlsecscan.py) |
| 23 | + 2. Open [Azure ML Studio](https://ml.azure.com/) |
| 24 | + 3. Go to the Notebooks tab |
| 25 | + 4. Upload the file into your user folder `/Users/{user_name}` (replacing `{user_name}` with your user alias) |
| 26 | +2. **Install the scanner**: open a terminal in Azure ML Notebooks and run `sudo ./amlsecscan.py install` |
| 27 | +3. **Run a scan**: in the terminal, run `sudo ./amlsecscan.py scan all` (this takes a few minutes) |
| 28 | + |
| 29 | +## Assessments |
| 30 | + |
| 31 | +The security scanner installs [ClamAV](https://www.clamav.net/) to report malware and [Trivy](https://github.com/aquasecurity/trivy) to report OS and Python vulnerabilities. |
| 32 | + |
| 33 | +Security scans are scheduled via CRON jobs to run either daily around 5AM or 10 minutes after OS startup. A CRON job also emits heartbeats every 10 minutes. Scans have their CPU usage limited to 20% and are deprioritized by running at priority 19. |
| 34 | + |
| 35 | +Trivy is configured to report vulnerabilities of severity either `HIGH` or `CRITICAL` for which a fix is available. The ClamAV realtime scanning is not enabled. |
| 36 | + |
| 37 | +## Telemetry |
| 38 | + |
| 39 | +In Log Analytics, the scanner reports hearbeats to table `AmlSecurityComputeHealth_CL` and assessment results to `AmlSecurityComputeAssessments_CL`. |
| 40 | + |
| 41 | +Examples of Log Analytics [KQL](https://docs.microsoft.com/en-us/azure/data-explorer/kql-quick-reference) queries: |
| 42 | +- Recent heartbeats and scan status: `AmlSecurityComputeHealth_CL | top 100 by TimeGenerated desc` |
| 43 | +- Recent assessments: `AmlSecurityComputeAssessments_CL | top 100 by TimeGenerated desc` |
| 44 | + |
| 45 | +## Installation |
| 46 | + |
| 47 | +Irrespective of how the scanner is installed, the scanner script must first be copied to the Azure ML workspace: |
| 48 | +1. Download [amlsecscan.py](amlsecscan.py) |
| 49 | +2. Open [Azure ML Studio](https://ml.azure.com/) |
| 50 | +3. Go to the Notebooks tab |
| 51 | +4. Upload the file into your user folder `/Users/{user_name}` (replacing `{user_name}` with your user alias) |
| 52 | + |
| 53 | +The scanner can be installed on both existing and new Compute Instances. |
| 54 | + |
| 55 | +### Existing Compute Instances |
| 56 | + |
| 57 | +To install the scanner using default settings, run the `install` command without parameters: |
| 58 | +```bash |
| 59 | +sudo ./amlsecscan.py install |
| 60 | +``` |
| 61 | +The scanner will use the first Log Analytics workspace to which the Azure ML workspace streams [diagnostic logs](https://learn.microsoft.com/en-us/azure/machine-learning/monitor-azure-machine-learning). |
| 62 | + |
| 63 | +If diagnostic logs are not enabled or an alternative Log Analytics workspace needs to be used, its ARM Resource ID can be specified on the command line using the `-la` parameter: |
| 64 | +```bash |
| 65 | +sudo ./amlsecscan.py install -la /subscriptions/{subscription_id}/resourceGroups/{resource_group_name}/providers/Microsoft.OperationalInsights/workspaces/{workspace_name} |
| 66 | +``` |
| 67 | + |
| 68 | +Another option, in case this configuration is reused multiple times, is to store the ARM Resource ID of the Log Analytics workspace in a JSON configuration file called |
| 69 | +`amlsecscan.json` in the same folder as `amlsecscan.py`: |
| 70 | +```json |
| 71 | +{ |
| 72 | + "logAnalyticsResourceId": "/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}/providers/Microsoft.OperationalInsights/workspaces/{workspace_name}" |
| 73 | +} |
| 74 | +``` |
| 75 | + |
| 76 | +> The ARM Resource ID of a Log Analytics workspace can be obtained by opening the [Azure Portal](https://portal.azure.com), navigating to the Log Analytics workspace, and copying this substring from the browser URL. |
| 77 | +
|
| 78 | +### New Compute Instances |
| 79 | + |
| 80 | +#### Using Azure ML Studio |
| 81 | + |
| 82 | +1. Create a file called `amlsecscan.sh` with content `sudo python3 amlsecscan.py install` . |
| 83 | +2. Open the [Compute Instance list](https://ml.azure.com/compute/list) in [Azure ML Studio](https://ml.azure.com) |
| 84 | +3. Click on the `+ New` button |
| 85 | +4. In the pop-up, select the machine name and size then click `Next: Advanced Settings` |
| 86 | +5. Toggle `Provision with setup script`, select `Local file`, and pick `amlsecscan.sh` |
| 87 | +6. Click on the `Create` button |
| 88 | + |
| 89 | +#### Using an ARM Template |
| 90 | + |
| 91 | +For automated deployments, the scanner can be installed as part of the ARM templates deploying the Compute Instances. |
| 92 | + |
| 93 | +Start by creating an ARM template, say `deploy.json`, with a `setupScripts` section. In the example below replace `{azure_ml_workspace_name}`, `{azure_ml_compute_name}`, `{user_name}` with appropriate values. You may also want to adjust `location`/`VMSize`, add `schedules`, etc. |
| 94 | +```json |
| 95 | +{ |
| 96 | + "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", |
| 97 | + "contentVersion": "1.0.0.0", |
| 98 | + "resources": [ |
| 99 | + { |
| 100 | + "type": "Microsoft.MachineLearningServices/workspaces/computes", |
| 101 | + "name": "{azure_ml_workspace_name}/{azure_ml_compute_name}", |
| 102 | + "location": "westus2", |
| 103 | + "apiVersion": "2021-07-01", |
| 104 | + "properties": { |
| 105 | + "computeType": "ComputeInstance", |
| 106 | + "disableLocalAuth": true, |
| 107 | + "properties": { |
| 108 | + "VMSize": "Standard_F2s_v2", |
| 109 | + "applicationSharingPolicy": "Personal", |
| 110 | + "sshSettings": { "sshPublicAccess": "Disabled" }, |
| 111 | + "setupScripts": { |
| 112 | + "scripts": { |
| 113 | + "creationScript": { |
| 114 | + "scriptSource":"inline", |
| 115 | + "scriptData":"[base64('sudo python3 {user_name}/amlsecscan.py install')]", |
| 116 | + "timeout": "10m" |
| 117 | + } |
| 118 | + } |
| 119 | + } |
| 120 | + } |
| 121 | + } |
| 122 | + } |
| 123 | + ] |
| 124 | +} |
| 125 | +``` |
| 126 | + |
| 127 | +Deploy the ARM template using the [Az CLI](https://docs.microsoft.com/en-us/cli/azure/): |
| 128 | +```bash |
| 129 | +az login |
| 130 | +az account set --subscription {subscription_id} |
| 131 | +az deployment group create --resource-group {resource_group_name} --template-file deploy.json |
| 132 | +``` |
| 133 | + |
| 134 | +## Clean Up |
| 135 | + |
| 136 | +To stop scan scheduling and remove the scanner, run `sudo ./amlsecscan.py uninstall` . |
| 137 | + |
| 138 | +## Troubleshooting |
| 139 | + |
| 140 | +### Ensure that telemetry is emitted |
| 141 | + |
| 142 | +Check the scanner health in Log Analytics: `AmlSecurityComputeHealth_CL | top 100 by TimeGenerated desc` . |
| 143 | + |
| 144 | +It should show heartbeats with `Type_s == 'Heartbeat'` every 10 minutes. |
| 145 | + |
| 146 | +Scan status (`Type_s == 'ScanMalware'` or `Type_s == 'ScanVulnerabilities'`) should appear as pairs of log entries, |
| 147 | +one with `Status_s = 'Started'` followed by one with `Status_s = 'Succeeded'`. If `Status_s` is `Failed`, the `Details_s` field includes the error message. |
| 148 | + |
| 149 | +If heartbeats are not present in Log Analytics, verify whether heartbeats can be emitted by running `./amlsecscan.py heartbeat` in a terminal on the Compute Instance. |
| 150 | + |
| 151 | +### Ensure that the scanner is running |
| 152 | + |
| 153 | +If logs are missing in Log Analytics, scans may not be running. Local logs are available in `syslog` for investigation: |
| 154 | +- Check `cron` logs: `sudo cat /var/log/syslog | grep -i cron` |
| 155 | +- Check scanner logs: `sudo cat /var/log/syslog | grep -i amlsecscan` |
| 156 | + |
| 157 | +The CRON configuration is located at `/etc/cron.d/amlsecscan` . |
| 158 | + |
| 159 | +Scans can be run manually with higher verbosity to get more details: `sudo /home/azureuser/.amlsecscan/run.sh scan all -ll DEBUG` . |
| 160 | + |
| 161 | +### Investigate Compute Instance deployment failures |
| 162 | + |
| 163 | +Compute Instance creation logs are stored under `/Logs/{azure_ml_compute_name}/creation`. |
| 164 | +They can also be found by selecting the Compute Instance in Azure ML Studio, clicking on the `Logs` tab, and opening the file `Setup > stdout`. |
| 165 | + |
| 166 | +### Verify that malware gets reported |
| 167 | + |
| 168 | +Malware detection can be verified by downloading a simulated malware file: `wget -O ~/eicar.com.txt https://secure.eicar.org/eicar.com.txt` . |
| 169 | + |
| 170 | +The malware should be reported in Log Analytics: `AmlSecurityComputeAssessments_CL | where Type_s == 'Malware' | top 100 by TimeGenerated desc` . |
| 171 | + |
| 172 | +### Verify that the scanner files are present |
| 173 | + |
| 174 | +After installation, the following files should be present on the Compute Instance: |
| 175 | + |
| 176 | +File|Description |
| 177 | +--|-- |
| 178 | +`/home/azureuser/.amlsecscan/config.json`|Scanner configuration |
| 179 | +`/home/azureuser/.amlsecscan/run.sh`|Scanner CRON entry point |
| 180 | +`/etc/cron.d/amlsecscan`|Scanner CRON schedule |
| 181 | + |
| 182 | +### Verify that resource-usage limits are in place |
| 183 | + |
| 184 | +When running through CRON schedule, scans have their CPU usage limited to 20% and are deprioritized by running at priority 19. |
| 185 | +When running manually, CPU usage is not limited and priority is left as default. |
| 186 | + |
| 187 | +After a scan is run, its `cgroups` configuration limiting resource usage can be found under `/sys/fs/cgroup/cpu/amlsecscan` . |
0 commit comments