Onboarding to Scenario Health
This guide outlines the step-by-step process for onboarding your Scenarios to Scenario Health.
Set up a Scenario review with the Scenario Health team
Before onboarding a Scenario, the Scenario Health team asks every team to set up a Scenario review discussion to:
- Discuss which Scenario is being measured (define Scenarios from the customer's perspective, measuring customer intent). As the subject matter experts for your business, you can help identify which Scenarios are critical for its success.
- Establish where the Scenario starts and where it ends.
- Review a system diagram of the flow, or review the UX and complete a UI walkthrough.
- Review telemetry to ensure the Scenario has the right instrumentation.
Once this is completed, Scenario owners can submit the onboarding request through the Scenario Management portal (the Scenario Health team will provide the Scenario Key needed to complete the onboarding).
Resources available for onboarding discussion:
- Scenario Health overview – Familiarize yourself with supported Scenario types, measurement dimensions and instrumentation ideologies.
- Scenario Onboarding PPT – You can collect data for your Scenario in this PPT for Scenario discussion (optional).
- Granting Scenario Health Permissions to Consume Metrics – How to grant Scenario Health permissions to consume your metrics.
Once the Scenario review is completed, you can continue onboarding using the Scenario Management portal.
Scenario Management portal
The new user experience is in line with the existing Scenario Health portal and enables Scenario owners to manage the onboarding of new Scenarios and changes to existing ones.
- The Scenario Management page has a familiar interface: the Scenario count is shown for the filtered org, divided by Scenario Group. The Scenario Management panel lists Scenarios based on the selected filters (Org, Executive, Owner, Scenario Group, etc.).
- Filtering by keyword works on Scenario Title, Scenario Group and Scenario Key.
- Clicking a Scenario Group filters the list down to that group.
- The Scenario Management section has 3 tabs: Manage, Alerts and Pending Approvals.
  - Manage – The default tab when the Scenario Management page loads; it lists all Scenarios that match the filters.
  - Alerts – Displays all Scenarios that have an active alert.
  - Pending Approvals – Lists all Scenarios that are waiting for Scenario Health team approval (new onboarding requests, changes and repull requests).
Scenarios are displayed in a read-only list with an edit icon under the Action column.
Scenario owners can expand a Scenario to view its metadata and query details.
Note
Scenario Health supports 3 Cosmos Scenarios that are non-editable and can’t be managed with the new experience. Please reach out to SHSupport@microsoft.com for any changes needed in those Scenarios.
To add a new Scenario, click on Add Scenario.
Add Scenario
Scenario owners can now request a new Scenario onboarding using the “Add Scenario” form, which also helps with query validation.
After clicking “Add Scenario”, provide the following metadata to onboard a new Scenario:
- Scenario Group – Select the option that best aligns with business or customer intent that is relevant to the Scenario (see FAQ).
- Scenario Name – Provide a one-line description of the Scenario that conveys what is being measured (for instance, you can add “API” at the end of the name if the requested Scenario is for an API).
- Scenario Key – The unique identifier for the Scenario (usually a 3-part key: commerce.<Scenario group>.<word description for the Scenario>) (see FAQ).
- Owner – Usually the engineering owner for the Scenario.
- Segment Owner – List of aliases that receive reliability dip notifications, or teams to reach out to with questions about the requested Scenario.
- Target – Most teams start with 99.9 as the target. If you request a lower target value, it will be reviewed in the Scenario Health team sync.
- Active flag – The Scenario must be marked as active for the data pull to happen. If it is not checked, no data will be pulled into Scenario Health.
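Before the review, it can help to sanity-check a proposed Scenario Key against the 3-part shape described above. The following is only an illustrative sketch, not the portal's own validation: the function name `check_scenario_key` is hypothetical, and it assumes that the group segment is the Scenario Group name lowercased with spaces removed and that the description is a single word. The Scenario Health team still assigns the actual key.

```python
import re

# Scenario Groups listed in this guide's FAQ.
SCENARIO_GROUPS = [
    "Ingestion", "Buy", "Invoice", "Pay", "Manage",
    "Financial Report", "Payout", "Ecosystem",
]

# Assumption: the group segment is the group name lowercased, spaces removed.
_GROUP_SEGMENTS = {g.lower().replace(" ", "") for g in SCENARIO_GROUPS}

# Assumption: the description segment is one word (letters, digits, _ or -).
_DESCRIPTION = re.compile(r"^[A-Za-z0-9_-]+$")


def check_scenario_key(key: str) -> list:
    """Return a list of problems with a proposed Scenario Key (empty if none)."""
    parts = key.split(".")
    if len(parts) != 3:
        return [f"expected 3 dot-separated parts, got {len(parts)}"]

    prefix, group, description = parts
    problems = []
    if prefix != "commerce":
        problems.append("first part should be 'commerce'")
    if group.lower() not in _GROUP_SEGMENTS:
        problems.append(f"unknown Scenario Group segment: {group!r}")
    if not _DESCRIPTION.match(description):
        problems.append("description should be a single word without spaces")
    return problems


if __name__ == "__main__":
    print(check_scenario_key("commerce.buy.checkout"))
    print(check_scenario_key("commerce.checkout"))
```

An empty list means the key matches the assumed shape; any entries describe what to revisit before the onboarding discussion.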
Every new Scenario should provide the query details for one of the data sources supported by Scenario Health. The following are the supported data sources and how to grant permission to each of them:
Kusto Clusters:
If the subscription is in the PME tenant, you need to grant permission to the ScenarioHealthProdDataPullReader Managed Identity so that Scenario Health can pull data from the Kusto database.
Adding User Managed Identity Permissions for Kusto:
The following steps add the Scenario Health managed identity to a Kusto database.
- Open the Azure portal and open the Azure Data Explorer cluster for the Kusto database.
- Click on the database and select the “Permissions” section.
- Click on “Add” and select “Viewer”.
- Search for ScenarioHealthProdDataPullReader and select the identity.
- Click refresh and the identity should appear.
If the subscription is not in the PME tenant, you need to grant permission to the SHDataCollector Service Principal so that Scenario Health can pull data from the Kusto database.
// Example command
.add database <DBName> viewers ('aadapp=SHDataCollector;72f988bf-86f1-41af-91ab-2d7cd011db47')
Please follow the documentation here to learn more about granting permission for a Kusto cluster. You can also grant permission from the Azure portal: search for the display name “SHDataCollector” to grant permission to this app.
Application Insights Classic (Deprecated) – Scenario Health needs reader permission to pull data from App Insights. Please follow the steps outlined below to grant the permission:
- Browse to Access Control on your App Insights resource and click Add role assignment.
- Select the Reader role and add the “SHDataCollector” service principal as the member to grant permission to.
- Click the Review + assign button to complete the grant operation.
Application Insights – Scenario Health needs reader permission to pull data from App Insights. Please follow the steps outlined below to grant the permission:
- For a Corp subscription, add the Scenario Health Service Principal:
  - Browse to Access Control on your App Insights resource and click Add role assignment.
  - Select the Reader role and add “SHDataCollector” as the member to grant permission to.
  - Click the Review + assign button to complete the grant operation.
- For a PME subscription, add the Scenario Health Managed Identity:
  - Browse to Access Control on your App Insights resource and click Add role assignment.
  - Select the Reader role and add “ScenarioHealthProdDataPullReader” as the member to grant permission to.
  - Click the Review + assign button to complete the grant operation.
Permission to Azure SQL – Please run the following commands to grant the Scenario Health app the reader role on your database:
For a Corp subscription, add the SHDataCollector service principal:
-- SQL statements to run on the SQL database that stores the data
CREATE USER SHDataCollector FROM EXTERNAL PROVIDER;
EXEC sp_addrolemember 'db_datareader', 'SHDataCollector';
For a PME subscription, add the ScenarioHealthProdDataPullReader Managed Identity:
-- SQL statements to run on the SQL database that stores the data
CREATE USER ScenarioHealthProdDataPullReader FROM EXTERNAL PROVIDER;
EXEC sp_addrolemember 'db_datareader', 'ScenarioHealthProdDataPullReader';
Permission to Geneva Metrics – Please follow this step-by-step guide to grant permission to your Geneva Metrics account:
- Browse to the Metrics Account settings under Account and select your Geneva account.
- Under the Machine Access option, browse to Key Vault managed certificates, add the Scenario Health Key Vault cert “ava.prod.geneva.keyvault.ava.prod” and save the configuration update.
Permission to Geneva Logs – Please follow this step-by-step guide to grant permission to your Geneva Logs account:
- Browse to the Logs Account settings under Account, select your Geneva Logs account and go to User Roles in the left menu.
- Locate the user role that has reader permission only (Access mode – readonly). If one doesn’t exist, please create one.
- Under the Managed Certificates option, browse to Key Vault Managed Certificates, provide “ava.prod.geneva.keyvault.ava.prod” as the SAN and click Save to update the configuration.
Authoring query
For Kusto, AppInsights and Azure SQL data sources, the query must meet the following requirements:
- The query should include the {BookMarkDate} and {EndDateTime} placeholders (see FAQ). They represent the time window covered by a single data pull run.
  - {BookMarkDate} – During data pull, this variable is replaced with the timestamp of the last successful data pull.
  - {EndDateTime} – During data pull, this variable is replaced with the end timestamp of the window being pulled in that run.
- The query output should include DataDateTime (timestamp), Total, Success, Failure and Latency (Latency is optional).
- The query should return data in UTC, and the output should be aggregated by a 1-hour time window.
- “Pull Data from” is the date from which data is pulled into Scenario Health.
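To make the placeholder mechanism concrete, the sketch below shows how a query template could be rendered for one pull window. It is an illustration only: `render_query` is a hypothetical helper, and the timestamp format is an assumption, since this guide does not specify how Scenario Health formats the substituted values.

```python
from datetime import datetime, timezone

# Assumption: timestamps are substituted as UTC strings in this format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def render_query(template: str, bookmark: datetime, end: datetime) -> str:
    """Replace {BookMarkDate} and {EndDateTime} with UTC timestamps.

    Both datetimes must be timezone-aware so the UTC conversion is unambiguous.
    """
    if bookmark.tzinfo is None or end.tzinfo is None:
        raise ValueError("bookmark and end must be timezone-aware")
    start_text = bookmark.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)
    end_text = end.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)
    return (template
            .replace("{BookMarkDate}", start_text)
            .replace("{EndDateTime}", end_text))


if __name__ == "__main__":
    template = ("TableName"
                " | where Timestamp >= datetime({BookMarkDate})"
                " | where Timestamp < datetime({EndDateTime})")
    print(render_query(
        template,
        datetime(2024, 1, 1, 5, tzinfo=timezone.utc),
        datetime(2024, 1, 1, 6, tzinfo=timezone.utc),
    ))
```

Rendering a template locally this way and running the result against your data source is one way to confirm the query returns hourly rows before submitting it.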
Kusto & Application Insights:
// Example query for Kusto & AppInsights
TableName
| where Timestamp >= datetime({BookMarkDate}) // Pull start date time
| where Timestamp < datetime({EndDateTime}) // Pull end date time
| <filters and query logic>
| summarize Total=sum(TotalTransactions), Success=sum(SuccessTransactions), Failure=sum(FailureTransactions), Latency=avg(LatencyTransactions) by DataDateTime=bin(Timestamp,1h)
Azure SQL:
-- Example query for AzureSQL (using stored procedure)
EXEC <dbo.ReliabilityStoreProcedureHourly> '{BookMarkDate}', '{EndDateTime}' -- Assuming the parameters are in this order
-- Example query for AzureSQL (using table select)
SELECT CONVERT(datetimeoffset, FORMAT(Timestamp, 'yyyy-MM-dd HH:00')) AS DataDateTime,
       SUM(TotalTransactions) AS Total,
       SUM(SuccessTransactions) AS Success,
       SUM(FailureTransactions) AS Failure
FROM TableName
WHERE Timestamp >= '{BookMarkDate}' AND Timestamp < '{EndDateTime}'
GROUP BY CONVERT(datetimeoffset, FORMAT(Timestamp, 'yyyy-MM-dd HH:00'))
Geneva Metrics:
// Example query for Geneva Metrics
metric("MetricName").dimensions("requestStatus").samplingTypes("Count" as TotalCount1h).resolution(1h)
| summarize Total = sum(TotalCount1h)
| join(
    metric("MetricName").dimensions("requestStatus").samplingTypes("Count" as SuccessCount1h).resolution(1h)
    | where RequestStatus == "Success"
    | summarize Success = sum(SuccessCount1h)
)
| project Total=replacenulls(Total,0), Success=replacenulls(Success,0), Failure=replacenulls(Total,0)-replacenulls(Success,0)
Note
“replacenulls” is needed to convert null values to zero when pulling data into Scenario Health.
Geneva Logs:
// Example server query for Geneva Logs
source
| <filters and query logic>
| summarize Total=sum(TotalTransactions), Success=sum(SuccessTransactions), Failure=sum(FailureTransactions) by DataDateTime=bin(Timestamp,1h)
// Example client query
source
Validating query
The validation function in Scenario Health performs a series of checks to ensure that the provided metadata is correct and that the Scenario Health API has access to the required backend to pull data. The following checks are performed during validation:
- Scenario metadata validation – The Owner and Segment Owner aliases should resolve, the provided Target and Next 9 Target should be correct (if a Next 9 Target exists, it should be the same as or higher than the Target), and none of the mandatory fields should be left blank.
- The Scenario Health data processor is able to connect to the customer data source and authenticate. Any failure to do so is returned in the result output for review.
- The provided query is syntactically correct (including the {BookMarkDate} and {EndDateTime} placeholders) and executes successfully without any exception.
- The query output has DataDateTime, Total, Success and Failure columns (Latency is an optional output column).
After successful validation of the query and the metadata, customers can save their request to onboard a new Scenario.
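Some of the checks above can be approximated locally before submitting the form. The sketch below is not the Scenario Health validator; `precheck` is a hypothetical helper that covers only the checks that need no access to the service (placeholders, output columns and the Target / Next 9 Target relationship). Alias resolution, connectivity and query execution still happen in the portal.

```python
# Required output columns from this guide; Latency is optional.
REQUIRED_COLUMNS = {"DataDateTime", "Total", "Success", "Failure"}
PLACEHOLDERS = ("{BookMarkDate}", "{EndDateTime}")


def precheck(query, output_columns, target, next9_target=None):
    """Return a list of problems found by the local checks (empty if none)."""
    errors = []

    # The query must contain both time-window placeholders.
    for placeholder in PLACEHOLDERS:
        if placeholder not in query:
            errors.append(f"query is missing {placeholder}")

    # The query output must expose every required column.
    missing = REQUIRED_COLUMNS - set(output_columns)
    if missing:
        errors.append("output is missing columns: " + ", ".join(sorted(missing)))

    # If a Next 9 Target exists, it must be the same as or higher than Target.
    if next9_target is not None and next9_target < target:
        errors.append("Next 9 Target must be the same as or higher than Target")

    return errors


if __name__ == "__main__":
    query = ("TableName | where Timestamp >= datetime({BookMarkDate})"
             " | where Timestamp < datetime({EndDateTime})")
    columns = ["DataDateTime", "Total", "Success", "Failure", "Latency"]
    print(precheck(query, columns, target=99.9, next9_target=99.99))
```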
Note
While adding a Scenario, if you didn’t select the “Executive” flag, make sure to flip the “Executive only” toggle to discover your newly added Scenario.
Notification Preference
Reliability Dip Notification:
- Notifications are sent when the system detects that the daily reliability has fallen below the defined target.
- There are 2 notification types: email and IcM.
- If you select email, we'll send email notifications to your segment owners.
- If you select IcM, please select the severity you want and pick the IcM team public ID that is associated with your stid. (If you are not sure about the IcM team public ID, you can find it here; see Details -> Public ID.)
Pending approvals
After you successfully submit a new Scenario or a change to an existing Scenario, the request is queued for the Scenario Health team to review and act on. This capability gives the Scenario Health team an opportunity to provide guidance and enforce a standard telemetry construct for all Scenarios.
This capability is designed for, and access-controlled to, the Scenario Health team only.
It allows the Scenario Health team to quickly review changes requested by Scenario owners. Depending on the nature of the request, the Scenario Health team may reach out to Scenario owners for more information.
Edit Scenario/ Data repull requests
As with adding a Scenario, edits and data repull requests used to be handled manually via email to the Scenario Health team. With this release, Scenario owners can request edits and data repulls through the Scenario Management experience.
Edit Scenario: As part of the editing experience, Scenario owners can change Scenario metadata, change the data source, or even change “Pull Data from”. All business rules and validations that apply to Add Scenario also apply to Edit Scenario.
Note
While editing a Scenario, if you didn’t select the “Executive” flag, make sure to flip the “Executive only” toggle to discover your edited Scenario.
Data re-pull: If the “Pull Data from” date changes, the request is treated as a repull request and data is repulled from the requested date.
Notifications/ Data pull failures / Alerts tab
Scenario Add-Edit notification – A notification is sent to Segment owners whenever a request to add or edit a Scenario, or to repull data, is approved or denied. Scenario owners can reach out to SHSupport@microsoft.com with any questions.
Scenario Data-Pull notification – A data pull notification is sent to Segment owners when there are issues pulling the latest data from the Scenario data source. These issues could be due to changes in the data source, intermittent query failures or an underlying infrastructure issue. The notification includes more information on the exact failure, which Scenario owners can use to troubleshoot and fix the issue. If the failure is due to the Scenario Health data puller, please reach out to SHSupport@microsoft.com for assistance. Scenarios that are failing to pull the latest data are also listed on the Scenario Management page under the Alerts tab.
Alerts – This tab displays all the Scenarios that are failing to pull the latest data. Filters applied to the page also filter the list of Scenarios shown in the Alerts tab.
Note
Currently all the notifications are only being sent by email. The Scenario Health team is working on adding support for IcM to enable DRIs to review Scenario data pull alerts.
Validate your Scenario and request to enable Executive flag
The last step in the onboarding process is to review and validate the data, the day-over-day reliability numbers and the Scenario metadata. Once this is completed, you should request to enable the “Executive” flag. Your Scenario is then part of regular tracking under live site and fundamentals discussions.
Frequently asked questions
What is Scenario Group and how to select a value?
Scenario Group is a logical grouping that helps you view all Scenarios related to a business operation. The currently supported values are “Ingestion”, “Buy”, “Invoice”, “Pay”, “Manage”, “Financial Report”, “Payout” and “Ecosystem”. When requesting a new Scenario, pick the one that most closely aligns with your business. It’s OK if you pick the wrong one; you can edit the Scenario Group after creation. Please reach out to shsupport@microsoft.com if you would like to add another business function to the Scenario Group selection.
What is Scenario Key and how to get one?
Scenario Key is the unique identifier for a Scenario. It’s a 3-part key: “commerce.<Scenario group>.<word description for the Scenario>”. After the Scenario onboarding review, the Scenario Health team will provide you with the Scenario Key to use during onboarding through the management portal.
What to expect after submitting the onboarding request (or data repull requests)
After you submit the onboarding request, the Scenario Health team reviews the Scenario metadata and query details and reaches out to Scenario owners with any questions. Once the onboarding request is approved, the Scenario Health data puller starts pulling the data and the Scenario becomes available on the Scenario Health portal. When submitting a new onboarding request, please select “Pull Data from” based on when you want the data pull to start. You can always request a data repull to load historical data.
Why am I getting an error "Response status code does not indicate success: 401"?
If you receive this error, it means that your data source does not have, or has lost, the required identity permissions called out in the Scenario Health onboarding instructions.
When validating my query, why am I getting an error?
If you receive an error when validating your query, you may have a syntax error such as incorrectly formatted text. To avoid this, we recommend validating the syntax in a KQL editor and copying the query directly from there.
How to use {BookMarkDate} and {EndDateTime} variables?
After Scenario onboarding, the Scenario Health service pulls data from your data source every hour. Internally, the Scenario Health backend maintains a timestamp that represents the datetime up to which it has populated data; the next run pulls data for the hour after that. The {BookMarkDate} and {EndDateTime} variables are replaced with backend timestamps at data pull runtime. The following are two edge cases for data pull:
- New Scenario – When you onboard a new Scenario, there is no timestamp value that can be used to replace {BookMarkDate}. In that event, the service uses the “Pull Data from” timestamp to start pulling data until it catches up to the current date and time. During this process, the pull service pulls data using a daily time window (instead of an hourly time window).
- Late arriving data – Certain data platforms can have data that arrives late. To accommodate this, we go back 6 hours for the start time when pulling data. This ensures that any data with an older timestamp that arrives within that 6-hour window is refreshed in Scenario Health.
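The two edge cases above can be pictured with a small window calculation. This is a rough sketch under stated assumptions, not the actual data puller logic: `next_pull_window` is a hypothetical function, and it assumes that catch-up mode applies whenever the bookmark is a day or more behind the current time and that the 6-hour lookback never reaches before “Pull Data from”.

```python
from datetime import datetime, timedelta, timezone

LATE_DATA_LOOKBACK = timedelta(hours=6)  # "go back 6 hours" from this guide
HOURLY = timedelta(hours=1)
DAILY = timedelta(days=1)


def next_pull_window(bookmark, pull_data_from, now):
    """Return (start, end) for the next pull; all datetimes are UTC-aware.

    bookmark is None for a newly onboarded Scenario.
    """
    # A new Scenario has no bookmark yet, so start from "Pull Data from".
    cursor = bookmark if bookmark is not None else pull_data_from

    # Assumption: while a day or more behind, pull one daily window at a time.
    if now - cursor >= DAILY:
        return cursor, cursor + DAILY

    # Steady state: hourly window, with the start moved back 6 hours so that
    # late-arriving rows with older timestamps are refreshed.
    start = max(cursor - LATE_DATA_LOOKBACK, pull_data_from)
    end = min(cursor + HOURLY, now)
    return start, end


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, tzinfo=timezone.utc)
    pull_from = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(next_pull_window(None, pull_from, now))
    print(next_pull_window(datetime(2024, 1, 10, 11, tzinfo=timezone.utc), pull_from, now))
```

Because the steady-state window overlaps earlier hours, a query should return one row per hour for the whole window so that refreshed hours replace the earlier values.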
Note
You need to provide data with hourly granularity, and your source data must be in UTC. Scenario Health converts UTC to PST when calculating daily reliability.
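To see why hourly UTC rows matter, the sketch below groups hourly rows into PST days and computes a daily percentage. It is only an illustration: `daily_reliability` is a hypothetical function, reliability is assumed to be Success divided by Total, and PST is assumed to be a fixed UTC-8 offset. This guide does not state the exact formula or how daylight saving time is handled.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: PST is treated as a fixed UTC-8 offset (no daylight saving).
PST = timezone(timedelta(hours=-8), "PST")


def daily_reliability(hourly_rows):
    """Aggregate hourly UTC rows into per-day reliability percentages.

    hourly_rows: iterable of (utc_datetime, total, success) tuples, where
    utc_datetime is timezone-aware. Returns {pst_date: percent}.
    """
    sums = defaultdict(lambda: [0, 0])  # pst_date -> [total, success]
    for timestamp_utc, total, success in hourly_rows:
        day = timestamp_utc.astimezone(PST).date()
        sums[day][0] += total
        sums[day][1] += success
    # Assumption: reliability = Success / Total; days with no traffic are skipped.
    return {day: 100.0 * s / t for day, (t, s) in sums.items() if t > 0}


if __name__ == "__main__":
    rows = [
        (datetime(2024, 1, 2, 7, tzinfo=timezone.utc), 1000, 999),   # Jan 1, 23:00 PST
        (datetime(2024, 1, 2, 8, tzinfo=timezone.utc), 1000, 1000),  # Jan 2, 00:00 PST
    ]
    for day, pct in sorted(daily_reliability(rows).items()):
        print(day, round(pct, 3))
```

The two sample rows are adjacent hours in UTC but fall on different PST days, which is why daily numbers in the portal may not match a day-level aggregation done in UTC.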
How do I see the progress of my data pull/repull?
Currently, Scenario Health doesn’t provide a detailed status of data pull progress. Once the data pull is complete, the Scenario Health portal home page shows the reliability data for Scenario owners. The portal may show incomplete data until the pull is complete. The Scenario Health team is working on adding more detailed status on data pull progress. If a data pull fails, Scenario owners are notified via email.
Can I suppress alerts?
The Scenario Health portal doesn’t support suppressing alerts. An alert generated for a Scenario usually indicates an issue with the Scenario data source, and Scenario owners are expected to review and rectify such failures. Please reach out to SHSupport@microsoft.com if you need assistance from the Scenario Health team.
I changed my Kusto function or changed my log data. How do I refresh reliability for previous weeks or days?
When Scenario owners change the underlying data source or query, only newly pulled data reflects those changes. To update the historical data, Scenario owners need to submit a data repull request by editing the Scenario and updating the “Pull Data from” value. After the request is submitted, the Scenario Health team may review the changes with Scenario owners before approving it.
How do I reach out to Scenario Health team for issues or feedback?
For any questions or assistance, please contact SHSupport@microsoft.com