How-to implement a SOLID Watchdog

close up photo of dog wearing sunglasses
Photo by Ilargian Faus on

We all have components in our software or network that we want to monitor how are they behaving.

This is usually called a watchdog component and I was not that happy with some of the implementations and decided to put a bit of my private time on it, as a bit of “technical challenge”.


Quick introduction

Recently in my Company we had to implement a watchdog to monitor hardware resources and their availability.
Reasoning is that network connectivity is “eventual” and we must react to these states (basically we are talking about trains, its wagons and accessing them from central so we have these things that happen like separating and uniting wagons, tunnels and our wonderful networks that work so flawlessly when we need them the most, right?)
But, this might be migrated to another scenarios, like watching out for micro services, identify if they are behaving properly and, if not, restart them if necessary…

I decided to put some fun tech time to get some of the best features I could find on the net and from my colleagues and create something close to the best solution possible to a watchdog system that could notify “some other system” when disconnection and/or re-connection events happen 😉
Yes, I wanted this to be efficient but also as decoupled as possible.

Maybe too much for a single blog post? Maybe you are correct, but come with me till the end of the post to see if I managed to get there…

Let’s get to it!!


An early beginning..

To monitor networking resources we will use a Ping, for this we must use its containing .NET namespace, System.Net.NetworkInformation.

Ping sends an ICMP (Internet Control Message Protocol) echo message to the computer with the specified IP address. We have also a parameter for a timeout in milliseconds. But, we have to note that if this number is very small, the ping can be anyway received even if the timeout ms have elapsed. So it seems a bit “unimportant”.

After we can use its basic construct so let’s try to ping ourselves..

int timeout = 10;
Ping p = new Ping();
PingReply rep = p.Send("", timeout);
if (rep.Status == IPStatus.Success)
       Console.WriteLine("It's alive!");

Not very exciting yet, but it Works (or it should… as we are pinging ourselves… unless we have a very restrictive firewall..)


Now some more timers.. and pinging Asynchronously…

Upon looking for references, Tim Cooker response to a particular post looked to me as the best implementation so far) Ref:

He is using a countdownEvent primitive to synchronise the pingAsync() responses which I particularly liked…

To takeaway is the synchronisation of the Ping.SendAsync() calls is a bit confusing

If you  bring it to Windows Forms, you will find some issues due to the nature of Ping.SendAsync()

Read more on it here:

Basically the Ping tries to raise the PingCompleted on the same thread that SendAsync() was invoked on. But, as we blocked the thread with the countdownEvent, the Ping cannot complete as the thread is blocked. Welcome Deadlock (I am speaking in the case of executing this code on Winforms).

In a console application it Works due that it does not have a synchronisation provider and the PingComplete() is raised in a thread pool thread.

Solution would be to run the code on a worker thread, but this will result that the PingComplete() will also be called on that thread..

Another Pearl on “Ping usage”

I found another article which is a must read:

Here it clarifies what we saw regarding using the Ping.SendAsync(). Basically we have another command on this .NET namespace with an extremely similar name, SendPingAsync(). At a simple view we would think it is redundant and does not sound right… so, what are they doing and what are they different?

According to MSDN

  • SendAsync method Asynchronously attempts to send an Internet Control Message Protocol (ICMP) echo message to a computer, and receive a corresponding ICMP echo reply message from that computer.
  • SendPingAsync Sends an Internet Control Message Protocol (ICMP) echo message to a computer, and receives a corresponding ICMP echo reply message from that computer as an asynchronous operation.

So translating a bit, SendAsync sends the ICMP asynchronously but the reception is not asynchronous. SendPingAsync ensures that the reception is asynchronous.

So, we should use SendPingAsync().


Links to the referenced MSDN sources:


Putting it all together with our “ol’ friend”, the “Timer”..

Initial implementations I have seen for a watchdog were usually following the pattern that a timer is created that will trigger a “watch this resource” task on a given regularity.

This Tick of the timer will trigger the Ping process we have seen.

I got two very good tips on this implementation which are:

  • Once the timer Tick is called, we pause the timer it by setting the timer period to infinite. We resume it only when the tasks to be performed, in this case a single ping, are done. This will ensure no overlap happens and we have no escalation on the number of threads.
  • Set the timer to a random time, to ensure a bit of variability on the execution to help on distributing the load, ie., the pings do not happen at the same time.


Some code:

using System;
using System.Net.NetworkInformation;
using System.Threading;

namespace cappPingWithTimer
    class Program
        static Random rnd = new Random();
        static int min_period = 30000; // in ms
        static int max_period = 50000;
        static Timer t;
        static string ipToPing = "";
        static void Main(string[] args)
            Console.WriteLine("About to set a timer for the ping... press enter to execute >.<");
            Console.WriteLine("Creating new timer and executing it right away...");
            t = new Timer(new TimerCallback(timerTick), null, 0, 0);       
            Console.WriteLine("press Enter to stop execution!");

        private static void timerTick(object state)
            Console.WriteLine("Timer created, first tick here");
            t.Change(Timeout.Infinite, Timeout.Infinite); // this will pause the timer
            Ping p = new Ping();
            p.PingCompleted += new PingCompletedEventHandler(p_PingCompleted);
            Console.WriteLine("Sending the ping inside the timer tick...");
            p.SendAsync(ipToPing, 500, ipToPing);

        private static void p_PingCompleted(object sender, PingCompletedEventArgs e)
            string ip = (string)e.UserState;
            if (e.Reply != null && e.Reply.Status == IPStatus.Success)
                Console.WriteLine("{0} is up: ({1} ms)", ip, e.Reply.RoundtripTime);
            else if (e.Reply == null) //if the IP address is incorrectly specified, the reply object can be null, so it needs to be handled for the code to be resilient..
                Console.WriteLine("Pinging {0} failed. (Null Reply object?)", ip);
                Console.WriteLine("response: {0}", e.Reply.Status.ToString());

            int TimerPeriod = getTimeToWait();
            Console.WriteLine(String.Format("rescheduling the timer to execute in... {0}", TimerPeriod));
            t.Change(TimerPeriod, TimerPeriod); // this will resume the timer

        static int getTimeToWait()
            return rnd.Next(min_period, max_period);


Main issues here is that this technique is meant to trigger a timer “watchdog” for every component/resource we want to monitor, so architecturally wise it might look like an uncontrolled mess that can create a lot of threads..


But this is a Task driven world… since .NET 4.5 at least..

So, now we know how to implement a truly asynchronous ping with SendPingAsync(),  which we can call recurrently with a Timer object instance..

But as of this writing we have better tools in .NET for asynchronous/parallel work.. and it is not Backgroundworker (which would be if we needed to deploy a pre- .NET 4.5 solution)…

…but using Async/Await Tasks, which we have since .NET 4.5 (aka, TAP).

Basically it would become a matter of creating a task for every Ping operation and wait for them to complete in parallel.

And if inside a watchdog, to call it every certain time.. maybe not with Timer… if we have .NET 4.5 we can try to avoid creating extra threads if we can, right?


And now, discussing our Watchdog implementation…

Wouldn’t be ideal to have a decoupled watchdog system that we are able to plug and play anywhere in a SOLID way?

Basically I try to think in simple patterns and provide simple solutions so the first that comes to my mind is implementing the watchdog using an observer pattern that we can register for getting updates on the “connectivity” status of different network resources.

To me only two simple connectivity status matter:

  • Connected
  • Disconnected

So the code can react on this…

We could add “reconnecting” but this would mean that the watchdog knows the implementation of whoever uses it, but we want to decouple (abstract) so this is something that the “user” app should have to manage by itself. So no.

To our watchdog if we get a IP response back, we are connected. And the Consumer app react on this two basic facts. Simple, right?

Another thing we would need is a way to add the elements to be watched by the Watchdog.

For now, I believe this list should have this information for each of the elements to watch over:

  • IP
  • ConnectionStatus
  • LastPingResponse (why not keep the latest ping reply)
  • ElapsedFromLastConnected (from the previous time we had a connection)
  • TotalPings
  • TotalPingsSuccessful


Timer or no Timer? As a timer will create a thread 100% we have also a way to have this done by TAP, and we can make the process cancellable, so the decision is easy.

And we put the Observer pattern in the mix too right?


So, to benefit our “Network” Watchdog subscribers (Observers) we will provide means for:

  • Attach or Register
  • Detach or Unregister
  • Be Notified

Also, we have the fundamental question on how do we want to do this… we can let them subscribe to the full list of resources, if they are centralised, it makes sense or, otherwise, we might have them to be observed by a concrete end component.. and on this case it might only be interested in a single resource.. so… what to do?

Basically I implemented the Subject Interface to both; the resource list and the concrete resource. Then we have a flexible design that fits all functions and fits with proper software quality standards.


The code is simple, a simple Project implementing a .NET Core component with a simple console application that showcases its usage:

img 01

Five interfaces are declared, one for the Network Watchdog so we can:

  1. Add resources to be watched
  2. Configure the Watchdog
  3. Start it
  4. Stop it

Yes,  future one would be to remove resources, but did not think that fully yet. But will probably get that in short.


The other four are the interfaces for the Observer pattern, one for the Subject and other for the Observer. I am more familiar with the concept of having a Publisher and a Subscriber and these names sound better and feel more meaningful to me tan Subject/Observer so I use them instead.

One pair is meant for the Network Watchdog, for all the resources and another one is meant for a concrete Resource itself.

img 02 - interfaces

Then we have an enum with the connectivity state, which holds Connected and Disconnected.

Then the next thing to watch is the implementation of ResourceConnection:

using ConnWatchDog.Interfaces;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net.NetworkInformation;
using System.Threading.Tasks;

namespace ConnWatchDog
    public class ResourceConnection : IResourcePublisher
        public string IP { get; private set; }
        public ConnectivityState ConnectionState { get; private set; }
        public IPStatus LastStatus { get; private set; } // Technically it can be obtained from the PingReply..
        public PingReply LastPingReply { get; private set; }
        public TimeSpan LastConnectionTime { get; private set; }
        public long TotalPings { get; set; }
        public long TotalSuccessfulPings { get; set; }
        private Stopwatch stopWatch = new Stopwatch();
        public bool StateChanged { get; private set; }

        // Member for Subscriber management
        List<IResourceSubscriber> ListOfSubscribers;

        public ResourceConnection(string ip)
            ConnectionState = ConnectivityState.Disconnected; // first we asume its disconnection until we prove opposite.
            LastStatus = IPStatus.Unknown;
            TotalPings = 0;
            TotalSuccessfulPings = 0;
            IP = ip;
            StateChanged = false;

            ListOfSubscribers = new List<IResourceSubscriber>();

        public void AddPingResult(PingReply pr)
            StateChanged = false;
            LastPingReply = pr;
            LastStatus = pr.Status;

            if (pr.Status == IPStatus.Success)
                LastConnectionTime = stopWatch.Elapsed;

                if (ConnectionState == ConnectivityState.Disconnected)
                    StateChanged = true;
                ConnectionState = ConnectivityState.Connected;
            else // no success..
                if (ConnectionState == ConnectivityState.Connected)
                    StateChanged = true;
                ConnectionState = ConnectivityState.Disconnected; 

            // We trigger the observer event so everybody subscribed gets notified
            if (StateChanged)

        /// <summary>
        ///  Interface implemenation for Observer pattern (IPublisher)
        /// </summary>
        public void RegisterSubscriber(IResourceSubscriber subscriber)
        public void RemoveSubscriber(IResourceSubscriber subscriber)
        public void NotifySubscribers()
            Parallel.ForEach(ListOfSubscribers, subscriber => {


This is a simple beast that implements the Observer pattern for if any other component wants to watch over a concrete resource.


The NetworkWatchdog code:

using ConnWatchDog.Interfaces;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;
using System.Threading;
using System.Threading.Tasks;

namespace ConnWatchDog
    public class NetworkWatchdogService : IWatchdogPublisher, INetworkWatchdog
        List<ResourceConnection> ListOfConnectionsWatched;
        List<IWatchdogSubscriber> ListOfSubscribers;
        int RefreshTime = 0;
        int PingTimeout = 0;
        public CancellationToken cancellationToken { get; set; }
        bool NotifyOnlyWhenChanges = false;
        bool IsConfigured = false;

        public NetworkWatchdogService()
            ListOfConnectionsWatched = new List<ResourceConnection>();
            ListOfSubscribers = new List<IWatchdogSubscriber>();

        /// <summary>
        ///  Interface implemenation for Observer pattern (IPublisher)
        /// </summary>
        public void RegisterSubscriber(IWatchdogSubscriber subscriber)
        public void RemoveSubscriber(IWatchdogSubscriber subscriber)
        public void NotifySubscribers()
            Parallel.ForEach(ListOfSubscribers, subscriber => {

        /// <summary>
        ///  Interfaces for the Network Watchdog
        /// </summary>
        public void AddResourceToWatch(string IP)
            ResourceConnection rc = new ResourceConnection(IP);

        public void ConfigureWatchdog(int RefreshTime = 30000, int PingTimeout = 500, bool notifyOnlyWhenChanges = true)
            this.RefreshTime = RefreshTime;
            this.PingTimeout = PingTimeout;
            this.NotifyOnlyWhenChanges = notifyOnlyWhenChanges;
            cancellationToken = new CancellationToken();
            IsConfigured = true;

        public void Start()

        public void Stop()
            cancellationToken = new CancellationToken(true);

        private async void StartWatchdogService()
            var tasks = new List<Task>();

            if (IsConfigured) {
                while (!cancellationToken.IsCancellationRequested)
                    foreach (var resConn in ListOfConnectionsWatched)
                        Ping p = new Ping();
                        var t = PingAndUpdateAsync(p, resConn.IP, PingTimeout);

                    if (this.NotifyOnlyWhenChanges)
                        await Task.WhenAll(tasks).ContinueWith(t =>
                        // now we can send the notification ... if any resources has changed its state from connected <==> disconnected 
                        if (ListOfConnectionsWatched.Any(res => res.StateChanged == true))
                    else NotifySubscribers();

                    // After all resources are monitored, we delay until the next planned execution.
                    await Task.Delay(RefreshTime).ConfigureAwait(false);
                throw new Exception("Cannot start Watchdog not configured");

        private async Task PingAndUpdateAsync(Ping ping, string ip, int timeout)
            var reply = await ping.SendPingAsync(ip, timeout);
            var res = ListOfConnectionsWatched.First(item => item.IP == ip);

Yes, now Reading it again the ListOfConnectionsWatched could be named differently like ListOfNetworkResourcesWatched… so keeping it in mind so I can update that in the Git repo later on.

As a detail, while writing the demo, I thought that in some cases we want all the updates to be notified, so we have a permanent heartbeat that we can bind to a UI. NotifyOnlyWhenChanges does this, otherwise we only send notifications if there has been a change in the connected/disconnected state.


And.. no timer is used, so at the end we are using:

await Task.Delay(RefreshTime).ConfigureAwait(false);

Which does what we want without extra threading.


The last piece of code is an example application which uses the presented software component:

This is an example on how to use the presented software component, my “use case” is “I want a ping sweep over my local home network to see who is responding or not”.

The code creates several resources to be watched and setup the class as a subscriber to the receive connection status update.

For this we have to implement the interface IWatchdogSubscriber and implement the update method.

    public class AsyncPinger : IWatchdogSubscriber
        private string BaseIP = "192.168.178.";
        private int StartIP = 1;
        private int StopIP = 255;
        private string ip;
        private int timeout = 1000;
        private int heartbeat = 10000;
        private NetworkWatchdogService nws;

        public AsyncPinger()
            nws = new NetworkWatchdogService();
            nws.ConfigureWatchdog(heartbeat, timeout, false);

        public async void RunPingSweep_Async()
            var tasks = new List<Task>();

            for (int i = StartIP; i < StopIP; i++)
                ip = BaseIP + i.ToString();

            var cts = new CancellationTokenSource();
            nws.cancellationToken = cts.Token;

        public void Update(List<ResourceConnection> data)
            Console.WriteLine("Update from the Network watcher!");
            foreach (var res in data)
                if (res.ConnectionState == ConnectivityState.Connected)
                    Console.WriteLine("Received from " + res.IP + " total pings: " + res.TotalPings.ToString() + " successful pings: " + res.TotalSuccessfulPings.ToString());
            Console.WriteLine("End of Update ");

Note that the CancellationTokenSource is there for convenience, this will stop the Watchdog after 60 seconds, but we could bound that to the UI or any other logic.

From my main clause, I have only two lines of code needed:

var ap = new AsyncPinger();           

I am happy with the results and time dedicated and I am already thinking on how to potentially extend  such a system like:

  • Extending to another scenarios like monitoring Services
  • Extending to provide custom actions (ping, Service monitor, application monitor, etc…) but this would require as well custom data too.
  • Extending to enable the watchdog to do simple actions like a “keep alive” or in case of a Service issue, to be able on its own to try to solve the issue “stop and restart” protocols…
  • Improve this above points with implementing the Strategy pattern “properly”.


So what do you think? What would you change, add or remove from the proposed solution to make it better?


Happy coding!

The full code with the sample usage can be found here:



NDepend, a review

Shortly ago I got my hands on NDepend, thanks to Patrick Smacchia, lead developer for NDepend, a static code analyzer made for performing static analysis on .NET code

On this blog post, I am doing to explain a bit what is it and what it does

What is NDepend?

As mentioned, a static analysis code for .NET & .NET Core. Static means that the code analysis is performed on the code while it is not being executed.

Usually static code analysis is performed to ensure that the code adheres to some guidelines or metrics, like the number of warnings, certain errors..

Probably, if you work professionally with .NET you have worked with static analyzers from visual studio itself, being the most common Fxcop, Stylecop or the more advanced SonarQube.

That said, the fact is that these code analyzers do not compete with NDepend as they are fundamentally different. In fact, they complement each other.

What is different?

Basically, the rule set implemented by NDepend is essentially different from the other static analyzers, like SonarQube or other Roslyn Analyzers. These are good to analyze what happens in a method, code, syntax and the code flow… whilst NDepend is good at seeings things from a wider, higher-level perspective. It is really focused on analyzing the architecture, OOP structure and implementation, dependencies – where the product name comes from 😉 -, metrics, breaking changes and mutability – and many others too.

The strength of NDepend relies in analyzing software architectures and their components, complexity and interrelation whilst other products strengths are at a different level, focusing more in code modules, being all of them of course excellent products.

NDepend is designed to integrate with some of these products, like SonarQube.

To know more, here

What does it do?

It performs static code analysis on .NET & .NET Core and, upon that, delivers the following information about your code and, importantly, its architecture:

  • Technical Debt Estimation
  • Issue identification
  • Code quality
  • Test Coverage
  • Complexity and diagrams
  • Architecture overview

(Note: it does way more but I’ve shortened to what I think Is important)

And it shows it in a very smart way, let me show you the NDepend Dashboard:

ndepend 01.JPG

Additionally it integrates seamlessly with visual studio, TFS and VSTS. Integrates especially well with the build process, provides the ability to analyze this data over time, comparing builds, test coverage, the build processes.

To know more, here 

Another feature, which is important for communicating to management and reasoning on “passing a milestone” or “fixing the technical debt” (read Technical Debt as the total issues that we leave in the code knowing they are there… but software has to ship, right?). But coming to this, it provides a smart estimation on it.


A quick example

To get some hands on .NET Core I implemented recently a simple service in .NET Core, which I implemented some tests just for fun and also made it asynchronous. Let’s see how it faces the truth! – Just bear in mind it was a just for fun project and time was limited 😉

I’ts quite easy, I followed the steps on the “getting started” video here, installed NDepend, its visual studio plug-in and opened my VS2017, where now appears an NDepend tab.

Let’s open my RomanConverter coding dojo self practice project and click on attach a new NDepend project.

ndepend 02.JPG

The following window appears and we can already click the “play” green button.

ndepend 03.JPG

On the bottom right corner, there is a sphere indicating the status of NDepend. This will start the analysis and the indicator will showcase that it is analyzing.

Once finished, our report will display itself on a browser.

ndepend 03a.JPG

From the Dashboard, click and explore the different information exposed.

A simple click in the Rules Pane, for example, in the violated rules gives us this dashboard:

ndepend 03b.JPG

I find it brilliant, not only the issues are identified but a stacked DataBars are used to showcase the rules with more issues or, with bigger times to fix, as well as having them slightly color identified so understanding which issue(s) are the most critical and deciding which ones to tackle first – or right away – is pretty damn intuitive.

Note to this: I also realized, thanks Patrick for pointing, that clicking on the issues will show them, so what seems like a presentation UI is becoming like a fully interactive dashboard that gets you into action – or to understand the underlying issues better.

There are easily identifiable what our managers would call “low hanging fruit”, easy to fix and saving trouble for later..

Other useful panel is the Queries and Rules explorer which we can open from the circle menu, on the bottom right corner. Or we can use the usual menu: NDepend à Rules à View Explorer Panel.

ndepend 04a.JPG

And it will appear:

ndepend 04.JPG

With this panel, we can explore the rules that the solution has violated, which are grouped into categories like Code Smells, Object Oriented Design, Architecture, Code Coverage, Naming conventions, a predefined “Quality Gates”, Dead Code, and many more… If we click on a rule, we can explore the  C# LINQ Query aka “CQLinq” query that defines it.

This CQLinq attacks a code model dedicated to code quality and can be edited live and also compiled & executed live.

An example of such rule follows:

// <Name>Interfaces must start with an I</Name>
 warnif count > 0 
 Application.Types.Where(t => t.IsInterface && !t.SimpleName.StartsWith("I"))

And it seems damn simple, even to me.. 😉

From here we can quickly access the offensive code and fix it.

Other visualizations that must be explored are the Dependency graph, matrix, and the Metrics view.

All are advanced visualizations which show great insight on our solution and how it is structured. Let’s see them.

Code Metrics View

ndepend 05.JPG

Here we can see that by default we are analyzing our Methods with Lines of Code versus Cyclomatic Complexity. This visualization being a treemap, helps greatly to understand the information.

We can configure the metrics, with a wide selection on level of granularity, size (of the treemap boxes) and color. An especially useful one is code coverage.

An example can be seen next, based on the own Ndepend 😉  source here


Dependency Graph

Here we can see a graph representation, initially on our namespaces which uses LOC as sizing factor for the nodes and the members for the width of the edges connecting them. We can also include the third party assemblies.

ndepend 06.JPG

It is great for getting to know if the interfaces are respected from a Software Architecture viewpoint or, to see if certain assembly is used only where it should be.

I saw no possibility to group several assemblies into one, for example, the explosion of Microsoft.AspNetCore into several assemblies is of no use, I would like to be able to group them into a single Node for example for having the graph more clear. Otherwise this can add noise which can might make other relations I want to visualize harder to detect.. (Update: Patrick Smacchia mentioned that this is on the works – cool!)


The Dependency Matrix

Large solutions would make the previous graph representation a mess, too many namespaces and assemblies. Here we can select namespaces or assemblies and restrict them, drilling down to the elements that we want to dig in, and go to their graph view.

ndepend 07.JPG

There we can select even a method and right click to either see it on a dependency graph view or as a dependency matrix.


What is special about it?

Simply said, its estimation is not only based on source code, but also on the solution level analysis, as mentioned earlier.

Also I mentioned the C# LINQ Queries and it seems to me like a quite flexible approach, everything is interlinked and all the analysis are performed on queries and a lot of the data presented is based on queries apart from the rules: trend, quality gate, searches..

Its visualizations are special, point. Showing the right visualization for the job in a simple, yet efficient, way. Yes, if we are not used to Graphs, Dependency matrixes or tree maps this tool can look intimidating, but don’t be. Once you get used to it, it will become second nature. I used it some years ago to fix a mess and it helped greatly. I did not use it fully though, just for two visualizations.

Other aspect I do really like is that whenever you visualize some information, all relevant information comes along. An example are the Rules! I like the detail that even on the top menu we can see what the current solution is violating.

Or the fact that when I see the rules panel, I see the issues next to the debt and the annual interest and more.

Fundamentally, helps by showing important information where we need it.

Should we use it?

First, to have a visual reference of your project and what is good (or wrong) on it. It can show a lot of things in a very visual way, which can help greatly in:

  1. Knowing the state of our solution
  2. Understanding (the issues, where are they, and the effort it will take to fix them)
  3. Communicating to managers


Concrete features:

  • Enforce Technical Debt measurement in your coding guidelines and processes. This is especially regarding the cost to fix and cost of letting unfixed issues “in”.
  • Understand the entanglement of our software
  • Analyze Software architecture and Code Quality
  • Accurately track the state of our software over time, being able to determine its improvement (or worsening) on its different attributes.


Summarizing – I like it a lot!

It is easy to get lost but it is a rare jewel with an entry barrier that you should push until it’s broken, to see its beauty or, said clearly, its usefulness.

To me its best feature is able to showcase already implemented software architectures in a different, kinesthesic way, with different visualizations tailored to showcase and highlight important aspects of your software.  This is great to see the real state of your code and understand it – and fix it.










Introduction to the Azure Machine Learning Workbench

Following the announcements post published some days ago  here,  we will dig deeper on this new tool, the Workbench. This is also called AML Workbench, which is shorter and this term will be used from now on to refer to Azure Machine Learning Workbench (glad about the acronym as I do not want to type that again :P)


But, what’s the AML Workbench?

It is a desktop application for Windows and MacOS, it has built-in data preparation that learns the data preparation steps as we perform them, which is able take avantage of the best open source frameworks including TensorFlow, Cognitive Toolkit, Spark ML and scikit-learn.

This also means that if you have a GPU that supports AI (read my earlier blog post on the topic here ) you will be benefitting from that power heads-on.

Oh, it has also a command line interface for those who like them 😀


Sounds interesting? Then let’s get started!


Concepts first!

AML – Azure Machine Learning

This is in need to be described at the earliest as it might be a bit confusing. This is a solution proposal from Microsoft that englobes different components and services to provide an integrated end-to-end solution for data science and advanced analytics.

With it we can prepare data, develop experiments and deploy models at cloud scale (read massive scalability here)

AML consists of a few components:

  • AML Workbench – Desktop tool to “do-it-all” from a single location.
  • AML Experimentation Service – I “suppose” this will enable us to validate hypothesis in a protected scenario.
  • AML Model Management Service – I suppose this will enable us to manage our models
  • Microsoft Machine Learning Libraries for Apache Spark (MML Spark Library) – I read Spark/Hadoop integration here, probably to Azure servers
  • Visual Studio Code Tools for AI – I read here R & Python integration with Visual Studio


This picture showcases how AML Workbench fits in the Microsoft AI Ecosystem:

AML intro architec high level.JPG

To say that AML fully integrates with OS (Open Source) initiatives such as scikit-learn, TensorFlow, Microsoft Cognitive Toolkit or Spark ML.

The created experiments can be run in managed environments as Docker containers and clusters running Hadoop with Spark (I am wondering why is Microsoft is only mentioning Spark there if they work together? – Ok! As it was built as an improvement over MapReduce it can also run Stand Alone in the cloud, that’s why!). Also they can use advanced hardware like GPU-enabled VMs in Azure.

AML is built on top of the following technologies:

  • Jupyter Notebooks
  • Apache Spark
  • Docker
  • Kubernetes
  • Python
  • Conda


AML Workbench (yeah, finally!)

Desktop application with a command-line for Windows & MacOS to manage ML solutions through the entire data science life cycle.

  • ETL
  • Model development and experiment management
  • Model Deployment


It provides the following functionalities:

  • Data Preparation that can learn by example (Wow!)
  • Data source abstraction
  • Python SDK for invoking visually constructed data preparation packages (SSIS anyone?)
  • Built-in Jupyter Notebook service and Client UX (like anaconda?)
  • Experiment monitoring and management
  • Role-based access to support sharing and collaboration
  • Automatic project snapshots for each run and version control (trazability on “experiments” at last!) along GIT integration
  • Integration with popular Python IDEs


Let’s install it!

First things first, do you have a computer with windows 10 or macOS Sierra? (I guess you won’t have a Windows Server 2016 at home, do you?) If so proceed.. else go update 😉


Oh, well… before installing we need to set up an ML experimentation account..

Go log-in in the Azure portal here

Click on the new button (+) on the top left corner of the Azure portal and tzpe “machine learning” and select the “Machine Learning experimentation (preview)”

Azure 01 MLE preview.JPG

Click create and fill in the nice form:

Azure 02 MLE preview.JPG

Be sure to select “DevTest” as the cost-saving Model Management pricing tear, otherwise it will have a cost. Dev Test appears as “0.00”. Otherwise you might forget and have a non-pleasant surprise…

Azure 03 MLE preview.JPG

As I am not that much into playing with azure at a personal level (mostly HOLs and learning) I deleted all my resources, including a DB I created at a HOL that suddenly had a cost… luckily very low… and created all the required elements from the ground up. Resource Group, experimentation account, storage account, workspace, model management account, account name… my recommendation is that you keep that data safe and close to you. As this is all protected by Microsoft’s Azure security.

Oh, and Click “Create” to create all this components in the cloud. We should see a “Deployment in progress..” message which should be over in a couple of minutes, as shown in the picture below.

Azure 04 MLE preview.JPG

Also we should see the details of the resources created, storage, resource group, etc… along some useful tools to download (at last) the AML Workbench.

We can also download it from here

And double click it or right click and select “install”.

After the installer loads we should see the gorgeous installer…

Azure 05 MLE preview.JPG

It’s clean, it’s! I meant modern!

As usual, click continue and you will be presented the dependencies and installation path, shown next.

Azure 06 MLE preview.JPG

There I did not like I could not change the installation path… so we can only click the install button… and cross fingers that this does not create any conflict with your Anaconda installation… as this is a clear preview (read here: under your own responsibility)

Oh, I do NOT recommend you to wait for the installation to finish…

…go watch a series or to the gym (as I did in fact) – the installation will take about half an hour to download and install all the required components…


Now AML Workbench (preview) is installed in your computer, congratulations!!

Azure 07 MLE preview.JPG

Note that you can find it at C:\Users\<user>\AppData\Local\AmlWorkbench

Oh, and get used to this icon, I have the feeling we will be visiting it for a while 😉

But let’s continue, we are not yet finished!!


First steps!

So, let’s do something! Baby steps of course..

Start it and log-in to your Azure/Microsoft account. Automatically you should be able to see the recently created workspace in the Azure portal.


Click on the plus sign next to “Projects” panel in the top left or in the text menu, select File and then “New Project…”

We will give our project a name, like “Iris”, then select a directory to save your Azure Machine Learning Projects in your local computer and add a description.

We have to select our workspace, which by default will be the one we just created in the first place.

We will select the template “Classifying Iris” and click on the “Create” button below. This template is a companion sample project for Azure Machine Learning which uses the iris flower dataset.

Azure 07b MLE preview.JPG

Once the project has been created we will see the dashboard of the recently created project.

We can see several sections from our dashboard: the home section of our project, the data sources, notebooks, runs and Files.

On the project dashboard panel we can see a description of our project with instructions on how to set it up and follow the Quick start and tutorials, as well as an execution section.

The Data panel showcases the data sources and the preparations for obtaining them. This is a pretty special section with truly amazing features that can only be found on the AML Workbench, but we will see it on a next post.

It is worth noting that the Notebook panel is basically a Jupyter notebook container, as on the installation there was a custom anaconda installation being made, which also did not seem to tamper with my installation of Anaconda…

We can also open the project in Visual Studio Code or other configured IDE.

If we do not have we can install it now here

On the text menu, select File, then “Configure Project IDE” and input a name and the path for your IDE, which I selected VS Code, as we can see next:

Azure 08 MLE preview.JPG

Once this is installed, we should install Python support for VSCode, so we go to the extensions menu and select one of the Python extensions. In my case I selected the Python 0.7.0 from Don Jayamanne, but this extension seems the most complete.

Azure 09 MLE preview.JPG

Once this is set up we can go to the text menu, click on File, then on “Open Project”, next to it should appear our configured IDE between brackets, “(Visual Studio Code)”. We can see VSCode with the project loaded and we can click on one of the Python source files, for example, We should see the syntax highlighter at work, intellisense also working, between some other features.

Azure 10 MLE preview.JPG

Now, let’s execute it, we can go to the project dashboard panel and select “local”, then the source file “” and add “0.01” in the arguments and click run.

We could also execute it on the Files panel, select the “”

On the right side of AML Workbench, the Jobs pane should appear and showcase the execution(s) that we have started.

While we are at it, we could try some other executions changing the argument to range from 0.001 to 10.

Azure 11 MLE preview.JPG

What we are doing is executing a logistic regression algorithm which uses the scikit-learn library.

Once the different executions have finished, we could go to the Runs panel. There, select “” on the run list and the run history dashboard should show the different runs.

Azure 12 MLE preview.JPG

We can click on the different executions and see the details.

By now we have grasped the concepts of AML and its ecosystem, configured our environment in Azure, installed the Workbench, configured and next created a sample project and executed it locally.

Hope this was a good introduction and you enjoyed it.




Give me power, Pegasus! – or the state of Hardware in AI

A bit of history..

It’s no wonder that many years ago, about 6 (in computer terms, that is) some companies started to provide specialized hardware & Software solutions to improve the performance of AI and Machine Learning algorithms, like nvidia with its CUDA platform. This is has been really important in the AI/ML industry as this graph shows:


Basically, an improvement of 33 times the speed of using a normal pc..

But if this graphic was not enough to motivate you to learn more (and get to the end of this article) –  see this other one:


This is a graph made on 2016 showcasing the evolution regarding AI processing power since 2012, which the 1X at the bottom is based on an already accelerated GPU for AI processing… which was set as a landmark or baseline on 2012 with Alex Krizhevsky’s study regarding usage of a Deep convolutional neural network that learned automatically to recognice images from 1 million examples. With only two days of training using two NVIDIA GTX 580 GPUs. The study name was “ImageNet Classification with Deep Convolutional Neural Networks”


It’s a BANG! – A big one, which many are calling the new industrial revolution – AI. There, many companies listened, adopting this technology: Baidu, Google, Facebook, Microsoft adopted this for pattern recognition and soon for more..

Between 2011 and 2012, a lot of things happened on AI: Google Brain project achieved amazing results – being able to recognize cats and people by watching movies (though using 2,000 CPU at Google’s giant data center) – then this result was achieved by just 12 NVIDIA GPUs This feat was performed by Bryan Catanzaro from NVIDIA along (my teacher!) Andy NG’s team at Stanford (Yay! I did your course so I can call you teacher :D)

Later on 2012, Alex Krizhevsky from the University of Toronto won the 2012 ImageNet computer image recognition competition, by a HUGE margin, beating image recognition experts. He did NOT write computer vision code. Instead, using Deep Learning, his computer learned to recognice images by itself, they named their neural network AlexNet and trained it with a million example images. This AI bested the best human-coded software.

The AI race was on…

Later on, by 2015, Microsoft and Google beat the best human score in the ImageNet challenge. This means that a DNN (Deep Neural Network) was developed that bested human-level accuracy.

2012 – Deep Learning beats human coded software.

2015 – Deep Learning achieves beats human level accuracy. Basically acquiring “superhuman” levels of perception.

To have an idea, the following graphic shows the acquired accuracy of both Computer Vision and Deep Learning algorithms/models:

ImageNet 2-milestone-web1.gif

Related to this, I wanted to highlight the milestone achieved by Microsoft’s research team on 2016 but before this, let me mention what Microsoft’s chief scientist of speech, Xuedong Huang said on December 2015: “In the next four to five years, computers will be as good as humans” at recognizing the words that come of your mouth.

Well, on October 2016, Microsoft announced a system that can transcribe the contents of a phone call with the same or fewer errors than actual human professionals trained in transcription… Again human perception has been beaten..


The Microsoft research speech recognition team

These advancements are made possible by the improvement in Deep Learning mainly which is acquired by massive calculation power like 2.000 servers of Google Brain or, as of now, just a few NVIDIA GPUs… this delivers results and results drive the industry and make it trust a technology and, more importantly, bet on it. This is what is has been happening along this years…

Our current AI/ML/DL “Boosters”:

They are essential tools to boost AI (ML, Deep Learning, etc..) and are supported by a day by day increasing number of tools and libraries.. (Caffe, Theano, Torch7, TensorFlow, Keras, MATLAB, etc..) and many companies use them (Microsoft, Google, Baidu, Amazon, Flickr, IBM, Facebook, Netflix, Pinterest, Adobe,… )

An example of this is the Titan Z with 5,760 CUDA cores, 12GB memory and 8 Teraflops

Comparatively, “Google Brain” has 1 billion connections spread over 16,000 cores. This is achievable with $12K with three computers with Titan Z consuming “just” 2,000 KW of power, Ditto.. – oh and if this sounds amazing, this is data from 2014… yeah, I was just teasing you 😉

It gets better…

As of today, we have some solutions already on the consumer market, which you might have in your home computer, like the NVIDIA Pascal based graphic cards:

Nvidia 1080 with 10 Gbps, 2560 NVIDIA CUDA Cores and 8GB GDDR5X memory

NVIDIA Titan Xp with 11 Gbps with 3584 NVIDIA CUDA cores and 12 GB GDDR5X memory

Here is a picture of the beautifully crafted NVIDIA 1080, launched by the end of June 2016:


And it’s my current graphic card, from when I decided to focus on Machine Learning and Data Science, by the end of 2016 😉 – I am getting ready for you baby! (currently learning Python)

Also, similarly, we have the Quadro family, focused on professional graphic workstations, for professional use. Being their flagship the Quadro P6000 with 3840 CUDA cores 12 Teraflops and 24GB GDDR5X.

And this just got better and better…

I could not help myself reminding myself of this scene from Iron Sky

Recently announced this past 10th of October 2017 we have the Pegasus nvidia drive PX, the autonomous supercomputer for fully autonomous driving, with a passively cooled 10 watts mobile CPU () with four high performance AI processors. Altogether they are able to deliver 320 Trillion Operations per Second (TOPS)

Pegasus! – I personally love the name (I think Mr. Jensen Huang must like the “Zodiac Cavalliers” very much! – as a good geek should ^.^)

I believe these AI processors are two the newest Xavier system-on-a-chip processors coupled with an embedded GPU based on the NVIDIA Volta architecture. The other two seem to be two next generation discrete GPU with hardware explicitly created for accelerated Deep Learning and computer vision algorithms. All in the size of a license plate.. not bad!

Here is a pic of the enormous “Pegasus” powerhorse:


Cute, right?

This is huge – again yeah. Think that this is basically putting 100 high-end servers in the size of a license plate.. Servers on current Hardware, that is..

And this is powered by…


Did I say volta?

This is nvidia’s GPU Architecture which is meant to bring industrialization to AI, and has a wide range of their products supporting this platform. NVIDIA Volta is meant for healthcare, financial, big data & gaming..

This hardware architecture consist of 640 Tensor cores which deliver over 100 Teraflops per second, 5x the previous generation of nvidia’s architecture (Pascal).

DGX systems – AI Supercomputers “a la carte” Based on the just mentioned Volta architecture, having 4x TESLA V100 or the Rack based supercomputer DGX-1 with up to 8 TESLA V100, having each an intel Xeon for each 4 V100. Oh, and all the other hardware boosted to support these massive digital brainpower..

Following some comparative picture to put things in the proper perspective…

nvidia dgx1.png

Here, in the hands of Jensen Huang, who is Nvidia co-founder and CEO, is a Volta V100, if you were wondering:


Smaller than the 100x servers it can beat, right? 

V100 family, along Volta Architecture, were presented just recently this year at Computex, end of May.

Oh, and the market responded extremely well…

Nvidia market share.JPG

They are also empowering IOT solutions for embedded systems, targeting small devices like drones, robots, etc.. to perform video analytics and autonomous AI, which is started becoming a trend now in consumer products..

The family of these products is called NVIDIA Jetson, with its TX2 being their flagship, having 256 CUDA cores and 8GB 128 bit LPDDR4 memory along two CPU (HMP Dual Denver + Quad ARM)

As you can see the race is on, and continues to accelerate and who knows where it will bring us to..

Hope you enjoyed this post, if you liked it, please subscribe 🙂


So, what do you think?

Please respond directly on my blog so I do not have to work on recopilating the information from different sources..



One more to the list.. Tableau it is ;)

Yay! It feels good to C-H-E-C-K !!!

A motivational factor that helps a lot to self commitment, motivation, keeping things organized and productivity – apart from other benefits (see here and here) is to have a checklist 😉

And a plan – or both- combined 😉

In my case, I shared it a few days ago… here is the link to it 

A couple of days ago I finished my Tableau “Basic to Intermediate” course on udemy, “Tableau 10 A-Z: Hands-On Tableau Training for Data Science“.

And yeah, it’s a bit more than a couple of days ago but… I am really happy!

Some days ago, a bit before I finished the Tableau course, I took one day off to go to my company (RDI – Roche Diagnostics International) internal BI conference, as I “officially” do not work on BI nor Data Science, I had to if I wanted to attend… I attended only the Data Visualization day, focused on… what else but Tableau!

If we look at the market positioning through the “Gartner Magic Quadrant for Business Intelligence and analytics Platforms”, we see this:

Research image courtesy of Gartner, Inc.

So, it makes sense to invest some time in Tableau, doesn’t?

Being an expert on usability, user interface design and UX (User eXperience) design & development, which I worked for years… I can say I am having lots of fun with this tool.

And then, back to the Tableau, we had the chance to have some interesting hands-on sessions where we were coding with our laptops on some new features, with pretty tight timed sessions…

To showcase that I was one of the first to raise the hand when the speaker(s) asked if somebody had already finished the exercises, repeatedly being on the initial group of early finishers, from 3 to 7 in a room of around 80 Tableau practitioners… I can’t say anything but that I am extremely happy and satisfied with this course!!!!

To this, it might have helped my over 20 years of Software Engineering or my business analysis skills acquired during my MBA, but most of the direct skills were directly from this course…

To add on, I did not only perform the exercises, but understood them inside-out and tried to apply the techniques to other scenarios, more tailored to my work or needs, playing around. That helped greatly.

So… Thanks Kirill! – I gave you 5/5 stars, which I believe are deserved 😉

I greatly recommend to you all this course and I have purchased the other two courses from Kirill dedicated to Tableau, “Tableau 10 Advanced Training” and “Tableau Expert: Top Visualization Techniques“.


I will probably perform them when I have some more experience and my recently acquired knowledge is laying on more solid ground – but definitely they are on my checklist!

Hope it was interesting and I do recommend you take a peek at Tableau.


Happy Data Visualization!

Jose Luis



Microsoft’s news at Ignite: A lot About AI..

That said I haven’t been at Ignite, but I am overwhelmed by the vast amount of announcements made there and around these dates… and the vast amount of content, 119 talks on AI, 313 sessions on Machine Learning, whoa, it’s getting crazy and the feeling is that all the technologies on Microsoft around AI/Machine Learning/Data Science are accelerating – Fast!

So, let’s catch up!

  • Microsoft ML Server 9.2 released – Microsoft’s Platform offer on Machine Learning and Advanced Analytics for enterprises. As big improvements, now it supports full data science lifecycle support for ETL (Extract Transform and Load operations), supporting R & Python. And yes, this is what was known earlier as Microsoft’s R server, whose name was not fully correct after ‘adopting’ Python 😉 Oh, and now it’s fully integrated with SQL Server 2017… you can read more at the official source. Or watch a quick 2′ introductory video here. I think it is pretty damn important on the full operationalization offer that Microsoft proposes…
  • Azure SQL Database supports now real-time scoring for R and Python
  • Yay! The pretty much expected next gen SQL Database Server from Microsoft has been released: SQL Server 2017, with full support for R & Python and including the before mentioned ML Server.
  • Azure Machine Learning has been greatly updated… this service now brings to us the AML workbench, a client application for AI powered data wrangling and “ML fun”, which I have reserved some time this weekend to download and “have some time together”.. also the AML Experimentation service has been launched to help Data Scientist to experiment faster and better, as well as the AML Model Management  Service. Here you can find an overview as “short” as it can be..
  • Microsoft R Client 3.4.1 release – supporting the obvious version of R, providing desktop capabilities for R development with the ability to deploy models and computations to a ML Server. Original source here. Note that to properly use this, it is vital to use the Visual Studio IDE and to install the R Tools for Visual Studio, which is free.
  • Talking about Visual Studio, we have now the Visual Studio Code Tools For AI which provide extensions to build, test and deploy Deep Learning / AI solutions, being fully integrated with Azure Machine Learning. Github is here.
  • Visual Studio Code for AI has been launched as well, meant for developing for Tensorflow, CNTK, AML and more.
  • The “Quantum Revolution” happened 😉 – the initial movement from Microsoft to embrace the next “shiny light” in computing: Quantum computing. But this is not released, but announced… there will be a Quantum programming language inheriting from C#, Python and F#, a quantum computer simulator and everything integrated with our favorite developer IDE. Short introduction (2′) here. To follow it and suscribe to news on this, go here and on the bottom there is a “signup” button which I’ve already pressed – so, what are you waiting for? 😉

Hope it was interesting!

Let me know if you liked it..  And.. Would you like a hands-on overview on the AML “Workbench”? 😉


Machine Learning / Data Science / AI / Big Data… There I go!!

Updated 29/11/2017:  I am adding AI programming to ramp up my Python skills and some focus into a gamification site, I have updated the article to reflect this.

Call it as you want.. it is a very fuzzy topic and there are many discussions on the names and concepts 😉

Since from some time, after the “death” of Silverlight, I had an empty space… which was to me the DRIVE, to me this is something exciting that gets me engaged, that pushes and motivates me to go further… it’s when you are in a hackathon and you have this feeling of…

This is it!!

And even .NET Core is an exciting thing with its .NET Standard compliance, And Azure is pretty exciting and improving on a day to day basis, they were not still bringing that “shiny” “Silverlight” factor that pushed me to play and explore with that technology and make it my playground… to devour design and interaction books as well as physics programming just to optimize resources and get to do magic in the UI… what times!!

So, I had two candidates: ML/DS (Machine Learning / Data Science) and AR/VR/MR… and the second is still not mature enough (and it was impossible to get a HoloLens too) I decided earlier this year to go for Machine Learning 🙂 – even you probably have figured it out after reading the title..

I have set up a path on this vast topic which is Data Science, Machine Learning and AI. And, on this path, to learn the best tools for the task in front..

That said, I already worked 2+ years in  ETL (Extract, Transform, Load) to prepare data in a big editorial, as well as in BI & reporting… as other knowledge I can leverage from my experience..

But what is Data Science exactly? (as well as those other buzzwords)

As my understanding goes, these are their meanings/areas:

  • Data Science – The “all goes in” discipline, collecting the data, organizing it, preparing for searching patterns in the data to be able to make advanced “tasks” on it like predictions, classification, etc.. Usually this tasks are the work of a Machine Learning Model that does the magic. Usually this profile has a decent background in Data management, and in defining data flows to integrate the data in a repository where the automated analysis can be done. Also this task requires Math and statistic skills.
  • Machine Learning – Science of creating (or adapting/tuning) algorithms that learn on their own from data (read: can be trained to perform better). Usually a mixed profile of a Matemathician & coder fits this position the best. To say that ML is a subset of Data Science.
  • Deep Learning: To most people, this is a subset of Machine Learning,  which are in fact a ML technique (neural networks). Which has had a lot of success in certain problems and is becoming a discipline on its own.
  • AI – Subfield of Computer Science to program computers that solve human tasks, so they can performing planning, moving, recognizing objects, etc… basically any task. This includes ML as making a prediction on a set of data is, basically, a task. That makes ML a subset of AI, basically. ML has as a goal to make computers handle the task of learning upon data and by themselves, so they can make predictions.

And even I believe this is a clear description, there are people still discussing about this definitions… Here are some more articles that discuss this topic in way more detail, like this one. If you want to understand how wide are the possibilities for a “data scientist” read this.

Some people have several different but similar opinions, and if you have time, you can read some of them. But…

I want to feel the power of DS/ML in my fingertips, know from the top to the bottom how to get things done understanding every single step and to be able to design, code and tune complex models that provide accurate results.. and to be able to explain those models through proper visualizations that provide a clear insight of the decision taken by the model.. And for this,

I have a plan…

Here is My path forward for DS-ML-DL…

Step A: become a Data Science / ML “begginer”

Goal: to become knowledgeable of what is “out there” what is the people using, what are the main technologies and get a feeling on them. Also I likeLove UI and believe that the proper presentation helps greatly to understanding so want to invest a good deal in data presentation skills.

  1. Andy Ng’s Machine Learning – done! – great base but everything done with Mathlab… and no excessive explanation as the exercises were pre-prepared.
  2. Udemy introduction to Data Science – done
  3. EDX program from Microsoft for Data Science– in progress (4 out of 11 courses)
  4. Tableau A to Z (done)

Step B: become a proficient, or at least intermediate, ML developer and DS practitioner:

Goal: To become competent in programming with a hands-on practical approach, both in R and in Python even I believe I will dig in deeper with Python as there is a lot more material in there.

  1. practicing with some courses in Python, 2 modules completed.
  2. practicing to polish my AI agent coding skills (in Python), currently implementing the “intermediate” challenges.
  3. Python A-Z (udemy, Kirill Eremenko)  (Done!)
  4. R a-z (udemy, Kirill Eremenko)
  5. Machine Learning A-Z, hands-on Phython & R (Udemy)
  6. Taming Big Data with Apache Spark & Phyton. (Udemy)

Step C: Become intermediate to advance ML Developer and get some experience:

Goal: do I need to explain? 😉

  1. Ensemble ML
  2. Start digging in on Kaggle, on examples and tutorials to get up to speed and compete in at least one Data Science contest. Ref:
    Kaggle is a ML “professional” racing competition so I want to have some ground skills and “driving” experience before joining a competition.
  3. I want this experience to consolidate my learnings all together with hands on experience, with a goal.
  4. Tableau expert top visualization techniques (to get some better knowledge of Tableau)


Step D: Get DEEP.

Goal: To get into the most deep and complex topic on today’s Machine Learning panorama, Deep Learning with the new computational advances seems to be key in implementing new approaches of predictive systems, and more – they are being used to develop AI systems able to develop strategies that beat the best humans at a task, to be creative as humans  can be, but without our limitations  – limited cpu power, limited ability to learn and procrastination.. I have setup the following courses

  1. Deep Learning A-Z
  2. Artificial Intelligence A-Z
  3. Join some Kaggle challenges regarding DL and-or AI development.
  4. Deep Learning: GANs and variational Autoencoders
  5. Bayesian Machine Learning in Python
  6. Cluster Analysis and unsupervised ML

Obviously this is a vast topic and things can evolve there or change…

Regarding Kaggle, it is in the right spots I believe. I consider it a way to stablish the learned skills and also getting some valuable experience. see this Quora post:
Also, I love hackathons and coding competitions… participating on these events always gets the best of me and gets me to develop even further than I expected, being that the biggest win – that said, winning or getting in top places does not feel bad at all 😉

And what about Microsoft tech?

well, I do plan to get up to date on all things Microsoft, as on top there is the Microsoft Data Science Orientation, and I have already been playing with Azure Machine Learning Studio, even participating in some competitions while I was performing the Andy Ng Course… I’d like to get hands on and create some content.. I am thinking on some articles on fundamental usage of Azure ML, to show the full usage of AML (create a data integration “data science” workflow, create a model and tune it, create a service and consume it from .NET, for example…
So, do you think such an article (or several articles that show how-to get this done) would be fun/useful?
And… what do you think of the plan? let me know any suggestion you might think of to improve it, I would really appreciate that a lot – I am just beginning 🙂
Update: I forgot to mention that am spicing up the course with a jewel site I found thanks to Microsoft’s Data Science course I am currently performing: – so some of their trainings will fit in here and there. Also I might consider any of the specializations from udacity later on, and heard that some of the nanodegrees “have it all” from somebody doing the courses… so that could be an option too… 😉