Data Policies
Data policies manage what users see when they query data in a table they have access to.
There are three different ways to restrict data with data policies:
- Row-level: Filter rows from certain users at query time.
- Column masking: Mask values in a column at query time.
- Cell masking: Mask specific cells in a column based on separate values in the same row at query time.
When applying a data policy, it will always be enforced for all users, following the principle of least privilege, unless optional exceptions are added to policies. Data policy exceptions are built using any of the following conditions, which can be mixed with boolean logic:
- If the user is a member of a group (or several groups)
- If the user possesses a particular attribute (or several attributes)
- If the user is acting under a purpose (or several purposes) for which the data is allowed to be used
Data policy exceptions are very similar to subscription policies from this perspective. With subscription policies, nobody has access to a newly created table until someone says otherwise with a subscription policy (as long as you follow best practices for newly created tables and views). Similarly, when a masking policy is set on a column or a row-level policy on a table, it applies to everyone until someone says otherwise with an exception to the data policy.
Leveraging lookup tables
If user metadata is stored in a table in the same data platform where a policy is enforced, it is not necessary to move that user metadata in Immuta. Instead it can be referenced directly using custom WHERE functions in data policies.
Below is an example row-level policy that leverages a lookup table to dynamically drive access to rows in a table:
CREDIT_CARD_NUMBER | TRANSACTION_LOCATION | TRANSACTION_TIME | ACCESS_LEVEL |
---|---|---|---|
0123456789 | Lewes, DE | 00:07:34 | 4 |
9876543210 | College Park, MD | 09:16:08 | 8 |
The final column in the table, ACCESS_LEVEL, defines who can see that row of data.
Now consider the following hierarchy:
In this diagram, there are 11 different access levels (AL) to data and the tree defines access. For example, if a user has Vegetables
, they get access levels 2, 3, 4, 9, 10, and 11. If a user has Pear
, they only get access level 8. In other words, a user with Vegetables
would see the first row of the above table, a user with Pear
would see the second row of the above table, and a user with Food
would see both rows of the table.
Taking the example further, that hierarchy tree is represented as a table in the data platform that we wish to use to drive the row-level policy:
ACCESS_LEVEL | ROOT |
---|---|
1 | Food |
2 | Food |
3 | Food |
4 | Food |
5 | Food |
6 | Food |
7 | Food |
8 | Food |
9 | Food |
10 | Food |
11 | Food |
2 | Vegetables |
3 | Vegetables |
4 | Vegetables |
9 | Vegetables |
10 | Vegetables |
11 | Vegetables |
5 | Fruits |
6 | Fruits |
7 | Fruits |
8 | Fruits |
4 | Carrots |
9 | Leafy |
10 | Leafy |
11 | Leafy |
5 | Orange |
6 | Orange |
7 | Orange |
8 | Pear |
10 | Lettuce |
11 | Lettuce |
That hierarchy lookup table can be referenced in the row-level policy as user metadata like this:
@columnTagged('access_level')
IN (SELECT ACCESS_LEVEL from [lookup table] where@attributeValuesContains('user_level', 'ROOT')
)
Walking through the policy step-by-step:
-
@columnTagged('access_level')
: This allows us to target multiple tables with an ACCESS_LEVEL column that needs protecting with a single policy. Simply tag all the ACCESS_LEVEL columns with theaccess_level
tag and this policy would apply to all of them. -
IN (SELECT ACCESS_LEVEL from [lookup table]
: This is selecting the matching ACCESS_LEVEL from the lookup table to use as the IN clause for filtering the actual business table. -
where @attributeValuesContains('user_level', 'ROOT')
: This is comparing the user's attributeuser_level
to the value in the ROOT column, and if there's a match, that ACCESS_LEVEL is used for filtering in the previous step. See the custom WHERE documentation for more details on these functions.
So, you can then add metadata to your users in Immuta, such as Vegetables
or Pear
and that will result in them seeing the appropriate rows in the business table in question.
The above example used a row-level policy, but it could instead do cell masking using the same technique:
Mask columns tagged
Credit Card Number
using hashing where@columnTagged('access_level')
NOT IN (SELECT ACCESS_LEVEL from [lookup table] where@attributeValuesContains('user_level', 'ROOT')
)
In this case, the credit card number will be masked if the access_level is not found for the user for that row.
Even if not using a lookup table, the power of the @columnTagged('tag name')
function is apparent for applying your masking or row-level policies at scale.
Other topics of interest
-
There are several ways to mask data, and choosing the correct masking type has various tradeoffs. It's important to understand those tradeoffs when choosing masking types.
-
How data policy conflicts and intelligent fallbacks are managed