Git Deployment – Shallow Clone Support in Azure App Services – The missing piece

Azure App Services and the open source project KuduSync behind this great Azure Service is a huge time saver for agile teams. Especially DevOps teams will like the continuous deployment features.  Personally I focus a lot on the Git based deployment which enables you to roll back and forward in seconds whenever it is required. Beside that, it is possible to work with standard tools available on market to implement continuous deployment or integration.

Deployments - Microsoft Azure 2017-07-18 06-48-11

When I started working with Azure App Services building Node.js apps, I wrote a little node package called Azure Deploy. It allowed me to push changes as part of a build process directly into the Azure App Service. Originally, CodeShip was the service of choice for the build process but since I need to support Git Repositories beside GitHub, BitBucket and GitLabs, I migrated to Visual Studio Team Services (VSTS) and the integrated build platform.

vso-build-tasks

After several months and hundreds of deploys, which means hundreds of commits to the local git repository, it became a fairly complex and fat thing. This is normally not a problem but my Azure Deploy package clones the local git repository from Azure App Service to a temp directory and copies the build output over it. Last but not least it commits and pushes the changes back to Azure. The big repository took more than 4 minutes to clone so I was wondering if I can use Shallow Clone to get only the latest state of the repository.

This idea works well on Unix based git servers, on GitHub or even in Visual Studio Team Services as well. But when you try to clone a local Git Repository of Azure App Services via Shallow Clone option

git clone --depth 1 https://github.com/jquery/jquery.git jquery

it ends up with an error. The error and its background is also documented in the GitHub project of KuduSync here. So what to do now?

Another nice option of Azure App Services is the option to pull changes from a Git Repository instantly after a commit. This works well in VSTS, based on GitHooks but also with GitHub and a couple other platforms. It’s also possible to clone via shallow clone flag from these repositories which closes the loop. The final solution is to commit into a VSTS or GitHub hosted publishing repository which triggers a pull deployment in Azure App Services.

At the end this change reduced the whole deployment time from 5 up to 9 minutes, down to approx. 90 seconds. You can find the updated Azure Deploy component in the NPM registry here.

azure-queue-client: delaying jobs made easy

Microsoft Azure offers a very powerful and cheap queueing system, based on Azure Storage. The node module azure-queue-client is a powerful component for node developers in order to interact with the Azure queues easily.    

The updated version of the azure-queue-client now supports delayed jobs. This makes it possible to easily delay a running job in the queue worker for a specific time, .e.g. 5 minutes, 1 hour or any other time less than 7 days in the future.

// config with your settings
var qName = '<<YOURQUEUENAME>>';
var qStorageAccount = '<<YOURACCOUNTNAME>>';
var qStorageSecret = '<<YOURACCOUNTSECRET>>';
var qPolling = 2;
// load the module
var azureQueueClient = new require('../lib/azure-queue-client.js');
// create the listener
var queueListener = new azureQueueClient.AzureQueueListener();
// establish a message handler
queueListener.onMessage(function(message) {
// just logging
 console.log('Message received: ' + JSON.stringify(message));
 console.log('Message Date: ' + new Date());
// generate the delay policy
 var exponentialRetryPolicy = new azureQueueClient.AzureQueueDelayedJobPolicies.ExponentialDelayPolicy(1, 5);
// delay the job
 console.log("Job was delayed " + exponentialRetryPolicy.count(message) + " times");
 console.log("Delaying the job by " + exponentialRetryPolicy.nextTimeout(message) + " seconds");
 return queueListener.delay(message, exponentialRetryPolicy);
});
// start the listening
queueListener.listen(qName, qStorageAccount, qStorageSecret, qPolling, null);

As the code sample shows, the module relies on the concept of delay policies. Implementing custom policies is allowed and supported. Built-in policies are the exponential delay policy and the static delay policy.

The module is actively used and maintained in the azure costs service, so it can be used in production. If you would like to contribute or get more detailed information, please visit the github project page.

Big Data in your browser: Parallel.js

Big Data often has something todo with analysing a big amount of data. The nature of this data makes it possible to split it up into smaller parts and let them be processed from many distributed nodes. Inspired from the team of CrowdProcess we like the idea to use the computing power of a growing web browser grid to solve data analytic problems.

The Azure Cost Monitor does not have the requirement to solve big data problems of user A in the browser of user B, we would never do this because of data privacy but we have a lot of statistic jobs which need to be processed. From an architecture perspective the question comes up why not to use a growing amount of browser based compute nodes connected with our system instead? Starting with this idea we identified that WebWorkers in modern browsers are acting like small and primitive compute nodes in big data networks. The team from the SETI@Home project also gave us the hint that this option works very well to solve big data challenges.

A very simple picture was painted very fast on the board to illustrate our requirements. The user should not be disturbed from the pre-calculation of statistic data in his browser and the whole solution should prevent battery drain and unwanted fan activities:

ParallelJS-Pic01

It’s also important to understand that some smaller devices like a RaspberryPI which is used for internet browsing or an older smartphone is not able to process the job in time to generate a great user experience. Because of this, the picture changed a bit and we invented a principal we call “Preemptive Task Offloading”.

ParallelJS-Pic02

“Preemptive Task Offloading” lives from the idea that the server and the browser are using the same programming language and the same threading subsystem to manage tasks. Because of that the service itself can decide whether it moves tasks in the browser on the end user or pre-calculates them on the server to ensure great user experience.

ParallelJS-Pic03

The illustrated solution is able to improve the user experience for your end users dramatically and lowers the hosting costs for SaaS applications in the same time.


How it works

The first step is to find the lowest common denominator, in our case it’s called JavaScript. Javascript can be executed in all modern browsers and in the server via node.js. Besides this node and web browser has concepts, e.g. WebWorkers to handle multi threading and multi tasking. The second important ingredient is a framework which abstracts the technical handling of  threads or tasks because they are working different in the backend or frontend. We identified parallel.js as a great solution for this because it gives us a common interface to the world of parallel tasks in frontend and backend technologies. Last but not least a system needs to identify the capabilities of the browser. For this we are using two main approaches. The first one tries to identify the capability to spin of web workers and identifies the amount of CPUs. For this we are using the CPU Core Estimator to also support older browsers. The second step of capability negotiation is a small fibonacci calculation to identify how fast the browser really is. If we come to a positive result our system starts the task offloading into the web browser, a negative result leads to a small call against our API to get the preprocessed information from our servers.


Conclusion

After testing this idea several weeks, I can say that this approach helps a lot to build high performance applications, with acceptable costs on the server side. Personally I don’t like the approach to give customer sensitive data into the browser of other customers to much, but I think this approach works great in scientific projects. What do you think about big data approaches in the browser? What are your pitfall or challenges in this area? Just leave a comment bellow or push a message on Twitter.

Azure Cost Monitor announces private and shared filters

The Azure Cost Monitor Team is happy to announce the launch of the new filtering feature, starting today.

Now, users are able to create data filters to get instant access to important or costly services. Filtering enables users to only show results that match a specific criteria. For example a stakeholder may want to only report on a specific tag value or type of cloud service.
It is even possible for team administrators to share created filters with team mates and co-workers, so everyone may stay focused on their cost drivers in the Azure subscriptions.

FilterExpensiveVMs

The system supports a top to bottom “and/or” logic what means that you can create the filters as you would naturally read a sentence. This allows you to combine different attributes in an “and” or “or” clause.


How to get started?
Adding a new filter to the Azure Cost Monitor is this simple:

1) Log In to the Azure Cost Monitor Dashboard and if you don’t have a team account migrate to a team (optional):

team-02-migrate-team

2) Select the “Create Filter” drop down at the spending reports page.

CreateFilter

3) Add several conditions to the filter and save it:

FilterExpensiveVMs

4) Just switch between different filters by selecting the needed one:

SelectFilters

5) If you are the team administrator share a created filter with the team by clicking the “Share with team” button:

ShareFilter


Interested in the filtering feature?
Try the new feature today by simply logging into your Azure Cost Monitor. The feature is part of any plan, starting from the free Basic plan up to the Enterprise plan. 

Any questions, wishes or ideas? Try our feedback portal or drop a mail to help@azure-costs.com.

Azure Table Store: How to backup safely

Microsoft Azure Table store is an amazing, simple, cheap and powerful service of the Microsoft Azure cloud. The service is something between a real NoSQL database and a simple KVP-store. I did many projects in the last time where the Azure Table Store was just a fast read cache or also the whole persistency backend.

As soon as the tables in Azure contain important data for the application provided to customers it’s necessary to think about backup. Microsoft guarantees that the data can not be corrupted on the storage (check out my article about the different storage account options) but accidental deletion, data corruption during automated processes or just by mistake can still happen. Where people are working shit sometimes happens, nobody can change this.

Backing up table stores is not that easy as backing up blob storage based on the idea of stamp copies. Every table needs to be replicated into another table or exported to the blob account. In my current project we searched for the perfect solution and finally came up with the following stack of services: We are using the Azure Cloud Backup from RedGate to export all tables in a GEO redundant storage account. With this solution we get a daily backup of our tables in parallel to Azure SQL backups based on the Microsoft built in features. The backups are stored into a GEO redundant storage account which helps to ensure that we have access to this backups even when one datacenter of Microsoft burns 🙂 or just looses power.

This setup in combination with the Microsoft Backup support for Azure SQL is very powerful and gives everybody the good feeling of being able to recover when chaos happens.

Pit-Falls when you integrate Twitter, Facebook or other social networks

Today I received a request from a platform to vote for a specific product. I was willing todo this, because I’m using it every day and I’m really happy with that.

With this kind of sentences discussions about failed social media integration start. In a world where many people are very concerned about their own data privacy a “but” follows very fast:

Currently I can’t do that because the website enforces me to authorise via Twitter or Facebook. This would be doable if they only request the rights they really need. I will not allow every system to post in my name because that’s very personal.

At the end of the story you as a service provider loose a customer on the click path for only one reason: You tried to get more rights as you typically need to operate your application. And from my own experience the right to post in the name of someone requires a strong trustworthy relationship.

Screen Shot 2015-03-06 at 07.32.16

Neither Twitter nor Facebook enforces any developer to request more rights than they really need. I did a lot of Twitter integrations and whenever you are doing this please check the FAQ of the identity provider of your choice clearly and follow the principal of minimal privileges.

Following this ideas you will loose less people on click path to your voting system!