Pages

Tuesday, June 26, 2018

Einstein Analytics: computeRelative getting started

computeExpression is a well-known transformation tool in Dataflow, you expect to get new fields as result, based on data in the same row. While computeRelative is also transformation tool in Dataflow, but you can compare with data in the same or the other row (first, last, next, and previous).

In computeRelative, you need to specify:
- input source
- partition by field: this is unique key field such as Opportunity Id, Account Id, etc.
- sort field: this is to tell data order
- computed fields: this would be the result fields from computeRelative.

Let's use this sample dataset:


Create a dataflow using edgemart to get existing dataset in Einstein Analytics:
- add computeRelative to get Previous Amount and Previous Closed Date
- using Opportunity Id as partition field
- order by Snapshot_Date


Here is the field attribute:


In this case, we would like to leave the Previous Amount and Previous Close Date as blank if there is no previous data for the particular records. If we leave the Default Value in Field Attribute as blank, there will be no default value defined in the JSON and this will cause errors when running the dataflow.

To fix that, download the JSON file and edit it manually, then upload it back to the dataflow.

  "computeRel1": {
    "action": "computeRelative",
    "parameters": {
      "source": "EdgeProspect",
      "computedFields": [
        {
          "name": "PreviousAmount",
          "label": "PreviousAmount",
          "expression": {
            "sourceField": "Amount",
            "offset": "previous()",
            "default": null
          }
        },
        {
          "name": "PreviousCloseDate",
          "label": "PreviousCloseDate",
          "expression": {
            "sourceField": "Closed_Date",
            "offset": "previous()",
            "default": null
          }
        }
      ],
      "orderBy": [
        {
          "name": "Snapshot_Date",
          "direction": "asc"
        }
      ],
      "partitionBy": [
        "Opportunity_ID"
      ]
    }
  }


Here is the final result:




Reference: computeRelative Parameters



Monday, June 25, 2018

Einstein Analytics: Binding syntax

Binding is one of the most powerful techniques to implement when building a dashboard, although it is also one of the most difficult as you need to edit the JSON directly. In a few blogs post earlier here and here, you may notice a function call coalesce(), let's see what is this function do before we jump further into the binding technique.

coalesce returns the first non-null source from a list of sources. This function is useful for providing a default value in case function returns a null value.
syntaxcoalesce(source1, source2,...)
example: coalesce(cell(step1.selection, 0, "column1"), "green")
  output: the output is the result returned by cell(step1.selection, 0, "column1"). However, if cell(step1.selection, 0, "column1") returns null, then the output is "green".


We can use cell, row and column function for binding, let's see them below:

cell 
returns a single cell of data as a scalar, like "This salesperson rocks", 2, or null
syntax: cell(source, rowIndex, columnName)

An error occurs if the rowIndex is not an integer, the columnName is not a string, or the cell doesn’t exist in the table.
example: assume that following rows from the step.
    {stateName: ‘CA’, Amount:100}, 
    {stateName: ‘TX’, Amount:200}, 
    {stateName: ‘OR’, Amount:300}, 
    {stateName: ‘AL’, Amount:400}, 

Although Einstein Analytics doesn’t store this data as a table, let’s show the data in this format to make it easier to understand the example
(row index) stateName Amount
0 CA 100 
TX 120 
OR  115 
AL  105 

sample function: cell(myStep.selection, 1, "stateName")
result: "TX"


column
returns one column of data (as a one-dimensional array) or multiple columns of data (as a two-dimensional array) -- allow users to select multiple values.
Syntax: column(source), [columnNames...])

Let's use a similar step with cell functions above
    {stateName: ‘CA’, Amount:100}, 
    {stateName: ‘TX’, Amount:200}, 
    {stateName: ‘OR’, Amount:300}, 
    {stateName: ‘AL’, Amount:400}, 

And here is the presentation virtual table format.
(row index) stateName Amount
0 CA 100 
TX 120 
OR  115 
AL  105 

sample function: column(myStep.selection, "stateName")
result: ["CA", "TX", "OR", "AL"]


Let us continue with binding. There are two types of bindings: selection binding and results binding, the selection or results of one step triggers updates in other steps in the dashboard.

Selection binding is a method used to update a step based on the selection in another step. Selection bindings are interaction-driven, where it’s evaluated each time the user selects something in a widget.

Results binding is a method used to update a step based on the results of another step.

syntax: {{ cell(<stepName>.<result|selection>, <rowIndex>, columnName>).<asString()|asObject()> }}
example: {{ cell(Static_Step_1.selection, 0, \"value\").asString() }}
--> from above is from 1st row
--> {{ }} = binding

syntax: {{ column(<stepName>.<result|selection>, columnName>).asObject() }}
example: {{ column(Static_Step_1.selection, [\"value\"]).asObject() }}

If you saw that coalesce function is to return the first not null function, we can combine it in our cell binding.
{{ coalesce(cell(Static_Step_1.selection, 0, \"value\"), cell(Static_Step_1.result, 0, \"value\")).asString() }}


From the above samples, notice there are 2 other functions always used in binding: asString() and asObject().

asString() function serializes a scalar, one-dimensional array, or two-dimensional array as a string, escapes double quotes in strings.
syntax: <input data>.asString()
example: cell(stepOpportunity.selection, 1, \"measure\").asString()
--> 1 from above is from 2nd row
{{ cell(color_1.result, 0, \"color\").asString() }}


asObject() function passes data through with no serialization, returns data as an object (an array of strings).
syntax: <input data>.asObject()
example:
{{column(static_1.selection, [\"value\"]).asObject()}}
{{cell(static_1.selection, 0, \"value\").asObject()}}



Reference:

Thursday, June 21, 2018

Einstein Analytics: Connect Data Sources and Binding between Dataset

Einstein Analytics allow us to have more than 1 dataset in a dashboard. With Connect Data Source, we can link the dataset using a common field from both datasets, but the field must be used in the dashboard.

Let us using following sample of datasets:



Connect the data source using "Region"



With Connected Data Source, when we select a Region in Target, the Actual Region picklist will be filtered too, if Apply filters from faceting is enabled, this is enabled by default.



The same when we select Actual Region with a value, Target Region will be filtered too -- if "Apply filters from faceting" is enabled.



Without adding Region (actual) dropdown list, Actual value will be still filtered correctly. That's all about Connect Data Source, but in some cases, we can't connect data source, binding is powerful for this, but you need to get your hand dirty with JSON.

Let's see the result first:


Selecting a Region in Target do not filter picklist value in Actual, but the Actual value is changed accordingly, the same as in Connect Data Source.

In this sample, we need to add a filter into the step of Actual value.
This is the original query:
 "query": {
            "measures": [
                    [
                       "sum",
                       "Sales_Actual"
                    ]
            ],
            "groups": [
                       "Sales_Region"
            ]
          }

Add filters to the query:
"query": {
            "measures": [
                    [
                      "sum",
                      "Sales_Actual"
                    ]
            ],
            "groups": [
                       "Sales_Region"
            ],
            "filters": [
                    [
                       "Sales_Region",
                       "{{column(list_Target.selection,[\"Region\"]).asObject()}}",
                       "in"
                    ]
             ]
          }

  • Sales_Region is API field name for the column Region from Actual dataset -- we will need to use that field as filtering
  • list_Target is the step name for Target Region drop-down.
  • Region is the API field name for the column Region from Target dataset.


Reference:


Wednesday, June 13, 2018

Einstein Analytics: Multi-Select Picklist to Text

As per the Summer '18 release, Einstein Analytics is not really friendly with the Multi-Select Picklist field from Salesforce. One of the issues, when you have multi-values in a record, it will show only the 1st value when you show the data as Value Table in a Lense or Table widget in a Dashboard.

Service Type contains multi-values in Salesforce (screenshot from Report)


EA shows the 1st value only of Service Type


As per this document, we can customize JSON dataflow to treat multi-select values as text. Once, you have the dataflow built:
  • Download the JSON file
  • Edit the file and add ,"isMultiValue": false after the field name and save it (you should backup the original JSON file)
  • Upload back the edited JSON file to Dataflow



Here is the result the multi-select values show as text in Einstein Analytics



Note: if you do not update the JSON dataflow as above, the data will still flow into EA as a multi-values field, it will work if you use it as a List filter, and it filters all values (not just the 1st value) for the result, let's see a sample here.

all data

filter North America

filter EMEA


Summer '21 release introduces mv_to_string() function, this function returns an alphabetically sorted, delimited string representation of a multivalue field. But, you need to enable indexing of multivalue fields in Tableau CRM from the setup menu, also notice that 'Indexing multivalue fields can slow down dataflows", so watch your dataflow run once this is enabled. This feature still in Beta as per the Summer '21 release. 



Reference

Sunday, June 10, 2018

Einstein Analytics: Getting started with cogroup

You can combine data from two or more data streams into a single data stream using cogroup. The group and cogroup statements are interchangeable.
The data streams must have at least one common field.
By default, only data that exists in both groups appear in the results.

example:
qs = cogroup qsd by 'State', qsp by 'State';
in this sample, data stream qsd contain field State and data stream qsp also contains field State, we can use it for grouping.

q = cogroup ops by 'Account', meetings by 'Company';
Account in ops data stream should have the same value as Company in the meetings data stream.


Use case: show death per state percentage from 2 datasets.



Let's use cogroup to combine the dataset:
dsd = load "StateDeath2";
dsp = load "StatePopulation2";
ds = cogroup dsd by 'State', dsp by 'State';
ds = foreach ds generate dsp.'State' as 'State', sum(dsp.'Count') as 'Population', sum(dsd.'Count') as 'Death', (sum(dsd.'Count')/sum(dsp.'Count')*100) as 'Death (%)';

The Result


Let's try to use Opportunity and User datasets from our previous blog.
dsu = load "user1";
dso = load "opportunity";
ds = cogroup dsu by 'Id', dso by 'OwnerId';
ds = foreach ds generate first(dsu.Name) as 'Name', sum(dso.Amount) as 'Sum_Amount';
ds = order ds by Name;



If you notice, Angela is not shown on that chart, because she does not have any Opportunity records. Remember that only data that exists in both groups appear in the results.

Wait... there is more, Outer cogrouping combines groups as an outer join. For the half-matches, null rows are added. The grouping keys are taken from the input that provides the value.

result = cogroup rows by expression [left | right | full], rows by expression;

Specify left, right, or full to indicate whether to perform a left outer join, a right outer join, or a full join.
Example: z = cogroup x by (day,origin) left, y by (day,airport);

You can apply an outer cogrouping across more than 2 sets of data. This example does a left outer join from a to b, with a right join to c:
Example: result = cogroup a by keya left, b by keyb right, c by keyc;

Let's use the same SAQL from above and add left outer join.
dsu = load "user1";
dso = load "opportunity";
ds = cogroup dsu by 'Id' left, dso by 'OwnerId';
ds = foreach ds generate first(dsu.Name) as 'Name', sum(dso.Amount) as 'Sum_Amount';
ds = order ds by Name;




Reference:


Saturday, June 9, 2018

Einstein Analytics: Opportunity Dashboard with SAQL Union to show all User

Let's go straight to the business requirements, "show all sales rep with their total sales amount, if there is no opportunity owned by the sales rep, sales rep name must be shown with $0."

High-level solution:
1. Load User master data
2. Load Opportunity data
3. Use UNION to combine the dataset

User data


Opportunity data


Build User master Lens group by Id and Name
q = load "user1";
q = group q by ('Id', 'Name');
q = foreach q generate 'Id' as 'Id', 'Name' as 'Name', count() as 'count';
q = order q by ('Id' asc, 'Name' asc);
q = limit q 2000;

Let's modify necessary SAQL:
1. Rename all q data stream to dsu -- for easier identifier and uniqueness
2. Rename projected 'Id' to 'User_Id', and 'Name' to 'User_Name' -- I'll tell you the reason later
3. Remove 'count' as we do not need it -- User_Id is unique
4. Add 'sum_Amount' with 0 in foreach -- I'll tell you the reason later
5. Remove limit

Here is the result
dsu = load "user1";
dsu = group dsu by ('Id', 'Name');
dsu = foreach dsu generate 'Id' as 'User_Id', 'Name' as 'User_Name', 0 as 'sum_Amount';
dsu = order dsu by 'User_Id';


Build Opportunity Lens group by OwnerId
q = load "opportunity";
q = group q by 'OwnerId';
q = foreach q generate 'OwnerId' as 'OwnerId', sum('Amount') as 'sum_Amount';
q = order q by 'OwnerId' asc;
q = limit q 2000;

Let's modify necessary SAQL:
6. Rename all q data stream to dso -- for easier identifier and uniqueness
7. Rename projected 'OwnerId' to 'User_Id' -- I'll tell you the reason later
8. Add 'User_Name' with "-" in foreach -- I'll tell you the reason later
9. Remove limit

Here is the result
dso = load "opportunity";
dso = group dso by 'OwnerId';
dso = foreach dso generate 'OwnerId' as 'User_Id', "-" as 'User_Name', sum('Amount') as 'sum_Amount';
dso = order dso by 'User_Id';


Combine the dataset with UNION
final = union dsu,dso;
final = group final by ('User_Id');
final = foreach final generate first('User_Name') as 'User_Name', sum('sum_Amount') as 'sum_Amount';


The Complete SAQL
dsu = load "user1";
dsu = group dsu by ('Id', 'Name');
dsu = foreach dsu generate 'Id' as 'User_Id', 'Name' as 'User_Name', 0 as 'sum_Amount';
dsu = order dsu by 'User_Id';

dso = load "opportunity";
dso = group dso by 'OwnerId';
dso = foreach dso generate 'OwnerId' as 'User_Id', "-" as 'User_Name', sum('Amount') as 'sum_Amount';
dso = order dso by 'User_Id';

final = union dsu,dso;
final = group final by ('User_Id');
final = foreach final generate first('User_Name') as 'User_Name', sum('sum_Amount') as 'sum_Amount';


The Moment of Truth



Explanation
  • we rename Id and Name in step (2) to have the same column name with step (7)  and (8)
  • we add 'sum_Amount' in step (4) to have the same column name with dso data stream
  • for our use case, we get the dataset aligned with the same column by adding dummy columns before using UNION to both data stream
  • In the last row, we use aggreagate function first() to return the first user name, as our union  start with dsu which contain user name, while dso at the second/last will always contain "-" for user name, see step (8)
  • In the last row, we also sum the 'sum_Amount' again, practically this is sum the 'sum_Amount'  with 0, remember we add 0 as dummy value in step (4)

Make it simple
Since sum_Amount always 0 in dsu, and User_Name always "-" in dso, we can just simply not need to add them to the data stream, and we will still get the same result, let's remove the unnecessary statement.

dsu = load "user1";
dsu = group dsu by ('Id', 'Name');
dsu = foreach dsu generate 'Id' as 'User_Id', 'Name' as 'Name';

dso = load "opportunity";
dso = group dso by 'OwnerId';
dso = foreach dso generate 'OwnerId' as 'User_Id', sum('Amount') as 'sum_Amount';

final = union dsu,dso;
final = group final by ('User_Id');
final = foreach final generate first('Name') as 'Name', sum('sum_Amount') as 'sum_Amount';
final = order final by 'Name';




Reference:


Monday, June 4, 2018

Einstein Analytics: Filter syntax

Filter Conditions in sfdcDigest

When you do not need to bring the whole data from a Salesforce object to Einstein Analytics (EA), you can filter the data when retrieving it in sfdcDigest.

With filter out unnecessary data flow to EA, this will help our dashboard builders do not need to keep filtering out certain unused data, such as inactive users for the User object. Filtering data also will increase performance on the Dataflow and consume less total of records stored in EA.

In this sample, I want to bring in all active users and not included Chatter users. So, there are 2 filters need to add in sfdcDigest:
- isActive = true
- userType = "Standard"

If you come from Salesforce background, you can simply enter isActive = true && userType = Standard, but this is different in EA. Until Summer '18 release, you need to manually enter this filter conditions in JSON format.

Add this filter in sfdcDigest under Filter Conditions:
[
{"field":"isActive","isQuoted":true,"value":"true","operator":"="},{"field":"UserType","value":"Standard","operator":"="}
]


If you see the first filter, isActive is a boolean field, so passing just true without "" will get an error when running the Dataflow, so we need to add "isQuoted": true, this is also applicable for a numeric and date field.

Let's see this in JSON Dataflow:


Another option is to use Complex Filter Conditions. For the same filter, you can simply enter this condition isActive = True and UserType = 'Standard'


Let's see this in JSON Dataflow:


Tips #1
If you have OR filter in the Complex Filter Conditions, you should add () around the syntax, otherwise, you will hit by an error when running the Dataflow,
example: (isActive = True OR UserType = 'Standard')

Tips #2
Never add Filter Conditions or Complex Filter Conditions if you enable data sync, otherwise, not all data will bring into Einstein Analytics.


Filter syntax in filter Transformation

Use this to filter out data that you do not want with filter node. So, if you have data sync enabled, instead of filter data in sfdcDigest, do the filtering in this filter node. You can use filter node as the Source for another filter node.



sample:
Equal - StageName:EQ:Closed Won
Equal - Opportunity.Partner_Type__c:EQ:Reseller
Range - EmployeeId:R:100:1000
Not Equal - RoleName:N:Manager
OR - Lead Status:EQ:Open|Contacted
AND - Lead Status:EQ:Qualified,Rating:EQ:Hot


From Spring '19 release, you can use SAQL as filter syntax, sample: (Amount > 0)  && (StageName != "Closed Lost")




Blog last updated: 4 Jan 2019



Reference: