Import Excel file using Custom API and Alternate keys
For a long time we’ve had alternative keys in Dataverse and Dynamics 365 CE, and I haven’t used them much and not because it’s a bad feature, but because I usually forget about them.
This blogpost is not about an actual project requirement, but about something I like to do in my spare time, and that is to try things I haven’t tried before in the Power Platform.
Let’s imagine we have an integration using an Excel file that is stored in SharePoint. So, let’s say we have a legacy system that doesn’t have an API available to integrate and the only way to get data from this legacy system is in an Excel file. Let’s also say that the legacy system generates the Excel file every day at 10 PM with thousands of contacts.
In a nutshell I will have to retrieve that Excel file from SharePoint, read each row and process it into Dataverse.
We know that there is an Excel connector in Power Automate that we can use to read data from an Excel file, but there are some limitations with this connector, one of them is that we cannot retrieve more than 5000 rows, here you can find more information about this connector:
So to bypass this limitation we have to use a loop action in Power Automate, ok for now let’s say we bypass the limitation, but now let’s think on how we can process all the rows in a more efficient way than using an apply to each action in the cloud flow.
We know that apply to each action is not the best when it comes to performance, and yes we can parallelize the apply to each action in the cloud flow, but you could still end up in a cloud flow running for several minutes or even hours trying to process thousands of rows.
So that’s why I was thinking about using a Custom API and calling it from the cloud flow, we know we can apply a batch UPSERT request inside custom API to create/update contacts in Dataverse, but here we have another limitation with custom API, and that is the two minutes runtime limitation.
As we know, Custom API shares some limitations with plugins, one of them is the two-minute runtime, after which we will get a timeout. So, I have to read the Excel file in batches and send those batches to the Custom API, let’s say batches of 250 rows.
With this approach we can bypass the 5000 row limitation in Excel and the two minute runtime in the Custom API.
Finally, it is worth mentioning that the legacy system does not have the GUID values to easily identify whether or not the contact already exist in Dataverse, so that is where the alternative keys come into play.
I will use the email address (column C in the Excel file) as the alternate key for the contacts table and the account code (column E in the Excel File) as the alternate key for the accounts table. The alternate key for the account table will help me easily to relate the account to the new/existing contact in Dataverse.
Without further explanation let’s see how we can build this solution.
Some details about the Excel file
As I mentioned earlier the Excel file is stored in SharePoint, for this scenario I’ve put the file in the typical Documents library we have in every SharePoint site:
The file is composed by 5400 rows and 8 columns:
First three columns are text type, the Telephone columns is number type, Account column is general type, Role and Gender are text type and Birthdate is Date Time type.
And finally the data in the Excel file is in table format.
Creating the Alternate keys
Here is the first fun part, so within a solution we have to go to the contact table and then select the Keys option:
Click in new, set a name and select the column which in my case is the emailaddress1:
BTW, sorry that some labels in the pictures are in spanish.
Anyway after a couple of minutes the alternate key will be active:
Now we have to do the same for the account table, in this case I will use a custom column called “Sic code” as the alternate key:
If you want to know more about alternate keys here you have the link:
Creating the Cloud Flow
The second part is the cloud flow which is in charge of the loop to get the Excel rows in batches and call the custom API, in my case for testing purposes I am creating a manually triggered cloud flow, in a real business case it should be a scheduled trigger:
Let me explain the above actions step by step, first I declare two variables, one to define the batch size and the second one to define the rows to be skipped in the next row retrieval:
The next action is calling Graph API to get the total number of rows:
To get the total number of Rows you can play around with Graph Explorer:
The next action is to parse the response from the Graph API:
And the next step is to do the iteration using a do until action:
Here the condition is: continue until the SkipRows variable is NOT greater than the total number of rows:
The next action is to calculate the initial row number to be used in the Excel Connector for the next retrieval:
The next step is to use the Excel Connector to get the batch of 250 rows, here we have to specify the start row (Top Count) and the end row (Skip Count):