Subject/Title: Best practices, key considerations, and epic fails of cloud adoption for an enterprise-level customer
Alternative option: Building a cost-effective architecture in the Azure cloud
Type of Content: Starting a series of blog posts that cover our experience with cloud technologies and cloud migrations. These blog posts will support our cloud lunch-and-learn events.
Target audience: IT executives and professionals
Summary of Content: Talk about our positive experience as well as the failures and mistakes we made during cloud adoption projects.
One of our customers, a leading company in the international wholesale energy market, decided to migrate their database workloads to the Microsoft Azure cloud. They turned to DB Best to help them develop a cost-effective yet high-performance solution. In this blog post, we share some of the Azure adoption lessons we learned on this project.
Generally speaking, our main goal was to design and build a Data Warehouse to feed a decision support system. An R model uses this Data Warehouse to simulate commodity prices and build an outlook over horizons from 3 months to 3 years.
However, we started with a more down-to-earth task. First, we redesigned the customer’s ETL system in the Azure cloud, following industry security best practices.
Azure cloud migration mistakes and fails
While managing the customer’s workloads in the Azure cloud, we encountered the following issues:
- Azure SQL Database Managed Instances
Initially, our customer wanted to use Azure SQL Database Managed Instance (DBMI). Microsoft positions Managed Instance as the best destination for moving a large number of existing SQL Server databases from on-premises, claiming “close to 100% compatibility” with SQL Server solutions. We had the chance to discover some of the remaining incompatibilities and limitations firsthand. In the end, the customer decided to back out of this solution because of limitations in SQL Server Agent job scheduling: with Azure SQL DBMI, we couldn’t reach the desired data refresh rate in the tabular model.
- Azure Analysis Services
We also tried leveraging Azure Analysis Services. However, this solution proved too expensive for our customer, so we rolled back to using tabular models in SQL Server Analysis Services.
- Azure Data Factory
Azure Data Factory also proved too expensive to use on a regular basis, so we created T-SQL procedures to replace the Data Factory pipelines. We still used ADF for several specific tasks, though.
- Scalability issues
We hadn’t set a default Azure location for new resources, so the location for each new resource was selected automatically based on price. Because related resources ended up in different regions, this led to performance issues, especially when managing huge volumes of data. Eventually, we had to migrate all data assets to a single location. The best practice is to specify a default location up front so that related resources stay close together and deliver the best possible performance.
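To illustrate the Data-Factory-to-T-SQL replacement described above, here is a minimal Python sketch of the stage-then-load pattern such procedures typically follow. This is a hedged sketch only: sqlite3 stands in for SQL Server, and the `staging_prices`/`prices` table names are hypothetical; in the real project this logic lived in T-SQL stored procedures.

```python
import sqlite3

# Hedged sketch of replacing an orchestration pipeline with plain SQL:
# stage the incoming rows, then run one set-based statement that loads
# them into the target table. Table names are hypothetical; sqlite3
# stands in for SQL Server here.

def run_etl_step(conn, rows):
    """Stage incoming rows, then load them into `prices` with a single
    set-based statement, mirroring a stored-procedure-style ETL step."""
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO staging_prices(symbol, price) VALUES (?, ?)", rows
    )
    cur.execute(
        "INSERT INTO prices(symbol, price) "
        "SELECT symbol, price FROM staging_prices"
    )
    loaded = cur.rowcount  # number of rows moved into the target table
    cur.execute("DELETE FROM staging_prices")  # staging is transient
    conn.commit()
    return loaded

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_prices(symbol TEXT, price REAL);
    CREATE TABLE prices(symbol TEXT, price REAL);
""")
loaded = run_etl_step(conn, [("GAS", 27.4), ("POWER", 51.2)])
```

Keeping the transformation set-based (one INSERT … SELECT rather than row-by-row processing) is what made the T-SQL replacement cheap to run on a schedule.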
On the other hand, we had an excellent experience with the following technologies and techniques:
- Creating multiple threads in Azure Functions written in C# and using these functions to interact with the database
- Using clustered columnstore indexes to feed Analysis Services (Tabular) models
- Leveraging in-memory (memory-optimized) configuration tables for parallel data inserts
- Uploading data from huge archive files (over 200 GB in size) using Azure Data Factory. This was probably the most complicated task, and we managed to fine-tune the configuration
- Creating a T-SQL procedure that calculates a cubic spline. However, since the customer decided to use the R language, which already ships packages for calculating any type of spline, we retired this procedure
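To give a feel for the cubic-spline item above, here is a pure-Python natural cubic spline, offered as a hedged sketch of the kind of interpolation the retired T-SQL procedure computed. The knot data is illustrative; the real project interpolated commodity price curves, and production ultimately used R’s spline packages instead.

```python
import bisect

def natural_cubic_spline(xs, ys):
    """Return per-interval coefficients (a, b, c, d) of the natural
    cubic spline S_i(x) = a + b*dx + c*dx^2 + d*dx^3, dx = x - xs[i]."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * ((ys[i + 1] - ys[i]) / h[i]
                        - (ys[i] - ys[i - 1]) / h[i - 1])
    # Thomas algorithm for the tridiagonal system;
    # natural boundary conditions: c[0] = c[n] = 0.
    l, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    c = [0.0] * (n + 1)
    b, d = [0.0] * n, [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])
    return ys[:n], b, c[:n], d

def spline_eval(xs, coeffs, x):
    """Evaluate the spline at x using the interval containing x."""
    a, b, c, d = coeffs
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(a) - 1)
    dx = x - xs[i]
    return a[i] + b[i] * dx + c[i] * dx ** 2 + d[i] * dx ** 3

# Illustrative knots on the line y = 2x + 1; a natural cubic spline
# through collinear points reproduces the line exactly.
xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
coeffs = natural_cubic_spline(xs, ys)
midpoint = spline_eval(xs, coeffs, 1.5)
```

Expressing the same tridiagonal solve in T-SQL is possible but verbose, which is part of why switching to R’s ready-made spline packages made sense for the customer.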
In the end, we delivered the following results to the customer:
- Set up the customer’s Azure infrastructure from scratch, including the network and Azure jumpboxes
- Crafted a near real-time Data Warehouse with data refreshed every 10 seconds, along with an ETL system that supports refreshing data every minute
- Used Azure Functions written in C# to implement the ETL procedures with the required triggers to upload data from FTP and web sources
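The multi-threaded ingestion pattern behind those ETL functions can be sketched as follows. This is a hedged Python stand-in for the real C# Azure Functions: `fetch()` substitutes for the actual FTP/web download, and the source names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch of the multi-threaded pattern our C# Azure Functions
# used: worker threads fetch independent sources in parallel, and the
# results are batched for a single database insert. fetch() stands in
# for the real FTP/web download; source names are hypothetical.

def fetch(source):
    # The real function downloaded and parsed an FTP or web feed here.
    return [(source, i) for i in range(3)]

def ingest(sources):
    rows = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order; results are gathered on the
        # main thread, so no locking is needed around `rows`.
        for batch in pool.map(fetch, sources):
            rows.extend(batch)
    return rows

rows = ingest(["ftp_feed", "web_feed"])
```

Fetching sources in parallel but inserting in one batch is what let the pipeline keep up with frequent refresh intervals without overwhelming the database with small writes.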