I am currently preparing to take the Microsoft Dynamics 365 + Power Platform Solution Architect certification. Or “MB-600” for short! As I prepare I plan to write revision notes; in this post I will cover topics around designing the data and security models.
Data is a big topic! I therefore plan to split this information into two posts. In this first post I will cover an overview of security and data design.
Note: Microsoft have recently updated some of their terminology. Therefore we should consider terms like CDS and Dataverse as interchangeable. In terms of the exam I would expect them to begin using the term Dataverse at some point, but that change is unlikely to be immediate, meaning in the short term either term could be used. I created this blog post before the revised terms were announced. I have tried to apply updates but a few “outdated” references may still exist!
As a Solution Architect we need to consider if there are any security, regulatory or compliance requirements that will impact the solution design. Unfortunately, I have often seen security handled as an afterthought, but the Solution Architect should consider security throughout the entire application lifecycle. From design and implementation through deployment and into ongoing operations.
A discovery phase should review / document existing security measures, because a single project is unlikely to change the entire approach to authentication. You should understand whether single sign-on is already in place, whether the customer is using third-party authentication products or “just” Azure Active Directory, and whether multi-factor authentication is in place.
It is also important to consider how the organization's structure may influence security models. To give just one example, I once worked with a large insurance company. For them it was critical that the data held by insurance brokers was kept isolated from records held by the insurance underwriting teams. These types of organizational structural requirements could lead the Architect to conclude that multiple business units, environments or even tenants are required.
The Architect may need to review / design multiple layers of security, including:
- Azure AD conditional access – blocking / granting system access based on user groups, devices or location.
- Environment roles – include user and admin roles (such as global admin). You can read about 365 admin roles here.
- Resource permissions for apps, flows, custom connectors etc.
- Dataverse (aka CDS) security roles – control access to entities and features within the Dataverse environment.
- Restrictions to the 300+ Dataverse connectors – data loss prevention policies (DLP) used to enforce how connectors are used.
You can read about restricting access here.
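To make the DLP concept concrete, here is a minimal Python sketch of how a policy's data groups behave. This is purely my own illustration of the rule (connectors from the "business" and "non-business" groups cannot be combined in one app or flow), not the real DLP engine, and the connector groupings shown are hypothetical examples.

```python
# Illustrative sketch of a DLP policy: connectors are assigned to data
# groups, and a single app / flow may only use connectors from one group.
BUSINESS = {"Dataverse", "SharePoint", "Office 365 Outlook"}
NON_BUSINESS = {"Twitter", "Dropbox"}

def app_allowed(connectors):
    """Return True if the app's connectors do not mix data groups."""
    uses_business = any(c in BUSINESS for c in connectors)
    uses_non_business = any(c in NON_BUSINESS for c in connectors)
    return not (uses_business and uses_non_business)
```

So an app combining Dataverse and SharePoint would be allowed, while one combining Dataverse and Twitter would be blocked by the policy.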
Within the Power Platform admin center we can optionally associate a Dataverse environment with a security group.
Whenever a license is assigned to a user, by default a user record is created in all enabled Power Platform environments within the tenant. This could effectively grant them access to all environments within the tenant.
Security groups can be used to limit access to environments. Then only users who are added to the associated security group will be added to the environment as a user. If a user is ever removed from the security group they are automatically disabled.
If a security group is associated with an existing environment all users in the environment that are not members of the group will therefore be disabled.
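The membership rule described above can be sketched in a few lines of Python. This is only an illustration of the behaviour (my own code, not how the platform implements it): once an environment is associated with a security group, the enabled users are simply the intersection of the environment's users and the group's members.

```python
# Hypothetical sketch: only security group members stay enabled in an
# environment once the group is associated with that environment.
def enabled_users(environment_users, security_group_members):
    """Return the set of environment users who remain enabled."""
    return set(environment_users) & set(security_group_members)

def disabled_users(environment_users, security_group_members):
    """Users in the environment but not in the group are disabled."""
    return set(environment_users) - set(security_group_members)
```

For example, if an environment has users alice, bob and carol, and only alice and carol are in the associated security group, bob's user record would be disabled.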
Whilst security is massively important, when presented with a requirement to restrict access to data it is worth questioning whether it is really a security requirement, or just filtering of data for convenience. By doing this you should create a solution which is secure but does not include any unnecessary boundaries.
Additionally you may need to consider any specific requirements around data storage. How long must data be retained? Does it need to reside in a specific country? Are there any laws that apply to how the data can be stored / used?
Sometimes regulatory requirements may also impose specific service level agreements or turnaround times. Or dictate that specific parties / governing bodies must be kept informed or involved in certain transactions.
Data should always be considered as a valuable asset. Therefore designing a security model to ensure the proper usage of and access to that valuable asset is paramount. Features like Azure Conditional Access and Data Loss Prevention Policies can be enabled. Additionally, ensuring proper usage of secrets / certificates for the services that access the data may be essential.
Below you can see a diagram which highlights the layers of security which are available. When working with Dynamics 365 an Architect may tend to focus on the security roles found in the Dataverse. But it should also be noted that beyond these roles additional layers of security exist. For example, Azure conditional access or restricted access to connectors.
There are numerous standards for data compliance covering industry certifications, data protection and physical security. During the requirement gathering phases the Solution Architect should question which standards are applicable and may need to confirm compliance. Some examples include ISO 27001 and GDPR. Microsoft publish details of various standards and their compliance here.
Design entities and fields
The Solution Architect should lead the data model design. With model-driven apps it is not uncommon for my design work to actually start with what data is to be stored and build out from there. Therefore establishing a high-level data architecture for the project can be an essential early task.
It may be common for the Solution Architect to design the data model at a high level before other individuals extend the design. For example, the Architect might define the core entities and their relationships, while the fields within those entities are defined later by the design team. If this is the case the Architect would still need to review these detailed designs and provide feedback as the detailed data model evolves.
The Dataverse includes the Common Data Model! This is a set of system tables which support many common business scenarios. The Common Data Model (CDM) is open-sourced in GitHub and contains over 260 entities. Many systems and platforms implement the CDM today. These include Dataverse, Azure Data Lake, Power BI dataflows, Azure Data services, Informatica and more. You can find the CDM schema in GitHub here.
In addition to the industry standard CDM schema, Microsoft provide industry specific accelerators aimed at particular vertical markets. Examples include Healthcare, Non-profit, Education, Finance and Retail. ISVs may then create industry specific apps which leverage these accelerators. You can find out about the accelerators here.
Whenever possible the standard system tables within the Common Data Model should be used. For example, if you need to record details about customers use the account table for that. This will not only make the system quicker and easier to develop but will aid future maintainability. All Architects should avoid reinventing the wheel!
It will be common to create diagrams to illustrate the data model, often called Entity Relationship Diagrams (ERDs). These show the entities (aka tables) and how they relate to other tables.
In my ERDs I like to highlight which entities are out of the box with no change, which are leveraging out of the box tables but with additional custom fields and which are completely custom.
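Those three categories can be captured as simple metadata alongside the ERD. The snippet below is a hypothetical illustration (the `new_policy` table name is made up) showing one way to tag each table by provenance and list them back by category.

```python
# Hypothetical ERD annotation: tag each table as standard (out of the
# box, unchanged), extended (out of the box plus custom columns) or
# completely custom.
tables = {
    "account": "standard",
    "contact": "extended",
    "new_policy": "custom",  # illustrative custom table name
}

def tables_by_provenance(tables, provenance):
    """Return the sorted table names matching a provenance tag."""
    return sorted(name for name, p in tables.items() if p == provenance)
```

Keeping this kind of classification next to the diagram makes it easy to see at a glance how much of the model is custom versus out of the box.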
Typically we will be thinking about tables and columns held within the Dataverse (CDS). Conceptually it might be easy to think of Dataverse as a traditional database, but the Dataverse is much more than that! It includes a configurable security model, can support custom business logic that will execute regardless of the application and even stores data differently depending on its type. Relational data, audit logs and files (such as email attachments or photos) are all stored differently.
Sometimes, rather than depicting the tables in an ERD the Solution Architect may first create diagrams to illustrate the flow of data within the solution. Without needing to “worry” about the physical implementation. These diagrams are known as logical data models. Only once the logical data model is understood would the Architect then create a physical data model based on the logical model. The physical data model could take the form of an ERD and could include data within Dataverse, Azure Data Lake, connectors or other data stores.
There are several strategies / techniques that can be used to help the Architect when creating a data model:
Always start by depicting the core tables and relationships – having a focus on the core tables will avoid getting side-tracked into smaller (less important) parts of the solution.
Avoid over normalization – people with a data architecting background may tend to try and build a Dataverse data model with traditional SQL database concepts in mind. A fully normalised database within the Dataverse may have an adverse impact on the user experience!
Start with the end in mind – it is often useful to start off by defining the final reporting needs, you can then confirm the data model adequately meets those requirements.
Consider what data is required for AI – if you intend on implementing an AI element into your solution design consider what source data will be needed to support any machine learning / AI algorithms.
Plan for the future – consider the data model requirements for today but plan for how this might evolve into the future. (But avoid trying to nail every future requirement!)
Use a POC – creating a proof of concept to demonstrate how the data might be represented to users can be a valuable exercise. But be prepared that this might mean trying a data model and then throwing it away and starting again.
Don’t build what you don’t need – avoid building out parts of the data model you don’t plan to use. It is simple to add columns and tables later, so add them when you know they are required.
Once the tables within your solution design have been settled you will need to move on to the detailed design task of creating columns (aka fields). Columns can be of many different data types, some of which have specific considerations. I will mention just a few here:
- Two options (yes/no) – when you create these ensure you will never need more choices! If unsure maybe opt for a “Choices” column.
- File and image – allows the storing of files and images into the Dataverse.
- Customer – a special lookup type that can be either a contact or account.
- Lookup / Choices (option sets) – which is best for your design? Choices (previously known as option sets) make for a simple user experience, but lookups to reference data can give more flexibility to add options later.
- Date / Time – be careful to select the appropriate behaviour. (user local, time zone independent, date only)
- Number fields – we have many variations to select from. Choose wisely!
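The date / time behaviours in particular are worth understanding, as the wrong choice is hard to correct later. The Python sketch below is my own illustration of the difference: a "user local" value is normalised to UTC on save and converted per viewer, while a "time zone independent" value is shown exactly as entered to every viewer. (The times used are arbitrary examples.)

```python
from datetime import datetime, timezone, timedelta

# Illustrative comparison of two Dataverse date/time behaviours.
entered = datetime(2021, 3, 1, 9, 0)   # 09:00 entered by a user in UTC+2
user_offset = timedelta(hours=2)

# "User local": normalised to UTC on save, converted back per viewer.
stored_utc = (entered - user_offset).replace(tzinfo=timezone.utc)
shown_in_utc_minus_5 = stored_utc.astimezone(timezone(timedelta(hours=-5)))

# "Time zone independent": stored and displayed as entered, everywhere.
stored_tzi = entered
```

So a 09:00 appointment entered in UTC+2 is stored as 07:00 UTC and displayed as 02:00 to a viewer in UTC-5 under the "user local" behaviour, but remains 09:00 for everyone under the "time zone independent" behaviour, which suits things like a hotel check-in time.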
Other options exist for storing files / images. Having images and documents in the Dataverse might be useful as security permissions would apply and the user experience to complete an upload can be “pleasing”. But size limits do apply, so storing large files might not be possible. Other options, like SharePoint, which is ideal for collaboration, exist. Or you could consider storing the files in Azure storage, which might be useful for external access or archiving purposes. As part of your revision you may need to be aware of the pros / cons of the various methods to store files!
Design reference and configuration data
When we create a table in the Power Platform there are some key decisions to make about how the table is created. Some of these cannot be easily changed later! For example, should the rows be user / team owned or organization owned?
User / team owned records have an owner field on every row in the table. This in turn can be used to decide what level of security is applied for the owner and other users. One good out of the box example of a user owned table might be the “contact” table. Each contact in the system is owned by a single user or team. It might then be that only the owner can edit the contact but maybe everyone can see the contact.
Alternatively tables can be organisation owned. With these you either get access or not! The records within the table are often available to everyone in the organization. These tables are ideal for holding reference / configuration data.
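The ownership distinction can be sketched as a simple access check. This is only a minimal illustration of the concept described above (my own code, not the actual Dataverse security engine, which also layers in security roles, business units and teams).

```python
# Minimal sketch: user/team owned rows carry an owner column that drives
# access decisions; organization owned rows do not.
def can_edit(row, user):
    """Hypothetical edit check based only on table ownership type."""
    if row["ownership"] == "organization":
        return True  # org-owned: access granted to the whole organization
    return row["owner"] == user  # user/team owned: compare the owner column
```

For instance, a contact row owned by one user would be editable by that user only, whereas a row in an organization owned reference table would be available to everyone.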
One consideration when designing reference data is whether a Choice (option set) or a lookup to an organization owned table is the best approach. I find “choices” most useful for short lists which rarely change, whilst lookups are ideal for longer lists that might evolve over time. (Users can be granted permissions to maintain the options available in a lookup, but as a Choice column forms part of your solution the development team would need to alter its items.)