Give me power, Pegasus! – or the state of Hardware in AI

A bit of history..

It's no wonder that some years ago (about six, which is an eternity in computer terms) some companies started to provide specialized hardware & software solutions to improve the performance of AI and Machine Learning algorithms, like NVIDIA with its CUDA platform. This has been really important in the AI/ML industry, as this graph shows:

[Image: chart of the speedup from GPU acceleration for AI workloads]

Basically, a 33x speedup compared with using a normal PC..

But if that graphic was not enough to motivate you to learn more (and get to the end of this article), see this other one:

[Image: NVIDIA Pascal chart of GPU AI processing power growth since 2012]

This graph, made in 2016, shows the evolution of AI processing power since 2012. Note that the 1x at the bottom is already a GPU-accelerated setup for AI processing… it was set as the landmark, or baseline, in 2012 by Alex Krizhevsky's study, in which a deep convolutional neural network learned automatically to recognize images from 1 million examples, with only two days of training on two NVIDIA GTX 580 GPUs. The paper was called "ImageNet Classification with Deep Convolutional Neural Networks".

BANG!!

It's a BANG! A big one, which many are calling the new industrial revolution: AI. Many companies listened and adopted this technology: Baidu, Google, Facebook and Microsoft used it for pattern recognition, and soon for more..

Between 2011 and 2012, a lot happened in AI. The Google Brain project achieved amazing results, being able to recognize cats and people by watching videos (though using 2,000 CPUs in Google's giant data center). The same result was then achieved with just 12 NVIDIA GPUs, a feat performed by Bryan Catanzaro from NVIDIA along with (my teacher!) Andrew Ng's team at Stanford (yay! I did your course, so I can call you teacher :D).

Later in 2012, Alex Krizhevsky from the University of Toronto won the 2012 ImageNet computer image recognition competition by a HUGE margin, beating image recognition experts. He did NOT write computer vision code. Instead, using Deep Learning, his computer learned to recognize images by itself. They named their neural network AlexNet and trained it with a million example images. This AI bested the best human-coded software.
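Just to give a flavor of what that means in practice, here is a minimal sketch (nowhere near AlexNet itself, and using today's Keras API as an assumption) of a convolutional network that "learns to recognize images by itself" from labeled examples rather than hand-written vision rules:

```python
# A toy convolutional network: the layers are defined, then the weights are
# learned from labeled images during training -- no hand-coded vision logic.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(1000, activation="softmax"),  # e.g. 1,000 ImageNet classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(images, labels, ...) is where the "learning by itself" happens.
```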

The AI race was on…

Later, by 2015, Microsoft and Google had beaten the best human score in the ImageNet challenge. This means a DNN (Deep Neural Network) had been developed that bested human-level accuracy.

2012 – Deep Learning beats human-coded software.

2015 – Deep Learning beats human-level accuracy, basically acquiring "superhuman" levels of perception.

To give you an idea, the following graphic shows the accuracy achieved over time by traditional Computer Vision and by Deep Learning algorithms/models:

[Image: ImageNet milestone chart comparing traditional Computer Vision and Deep Learning accuracy]

Related to this, I wanted to highlight the milestone achieved by Microsoft's research team in 2016. But before that, let me mention what Microsoft's chief scientist of speech, Xuedong Huang, said in December 2015: "In the next four to five years, computers will be as good as humans" at recognizing the words that come out of your mouth.

Well, in October 2016, Microsoft announced a system that can transcribe the contents of a phone call with the same or fewer errors than human professionals trained in transcription… Again, human perception had been beaten..

[Photo: the Microsoft Research speech recognition team]

These advancements are made possible mainly by improvements in Deep Learning, which in turn are enabled by massive computing power: the 2,000 servers of Google Brain or, nowadays, just a few NVIDIA GPUs… This delivers results, and results drive the industry, making it trust a technology and, more importantly, bet on it. This is what has been happening over these years…

Our current AI/ML/DL “Boosters”:

GPUs are essential tools for boosting AI (ML, Deep Learning, etc..). They are supported by an ever-growing number of tools and libraries (Caffe, Theano, Torch7, TensorFlow, Keras, MATLAB, etc..), and many companies use them (Microsoft, Google, Baidu, Amazon, Flickr, IBM, Facebook, Netflix, Pinterest, Adobe…).
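To make the "booster" idea concrete, here is a minimal sketch (assuming a recent TensorFlow install, one of the libraries listed above) of how a framework picks up a CUDA-capable GPU and runs the heavy math on it:

```python
# List the CUDA-capable GPUs the framework can see, then run a large matrix
# multiply -- exactly the kind of operation GPUs accelerate for ML workloads.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)

with tf.device("/GPU:0" if gpus else "/CPU:0"):
    a = tf.random.normal((4096, 4096))
    b = tf.random.normal((4096, 4096))
    c = tf.matmul(a, b)  # runs on the GPU when one is available
print(c.shape)
```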

An example of this is the Titan Z, with 5,760 CUDA cores, 12 GB of memory and 8 teraflops of compute.

Comparatively, "Google Brain" has 1 billion connections spread over 16,000 cores. Something equivalent is achievable for about $12K with three computers equipped with a Titan Z each, consuming "just" 2,000 watts of power.. Oh, and if this sounds amazing, keep in mind this is data from 2014… yeah, I was just teasing you 😉
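A quick back-of-the-envelope check, using only the figures quoted above (a rough sketch: raw core counts and peak teraflops are not directly comparable across CPU and GPU workloads):

```python
# Figures quoted in this post: Titan Z = 5,760 CUDA cores and 8 TFLOPS,
# "Google Brain" = ~16,000 CPU cores; three Titan Z machines ~ $12K.
titan_z_cuda_cores = 5_760
titan_z_teraflops = 8
machines = 3

print("Total CUDA cores:", machines * titan_z_cuda_cores)  # 17,280 vs ~16,000 CPU cores
print("Peak teraflops:  ", machines * titan_z_teraflops)   # 24 TFLOPS for roughly $12K
```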

It gets better…

As of today, we already have some solutions on the consumer market, which you might have in your home computer, like the NVIDIA Pascal-based graphics cards:

NVIDIA GeForce GTX 1080, with 2,560 CUDA cores and 8 GB of GDDR5X memory running at 10 Gbps

NVIDIA Titan Xp, with 3,840 CUDA cores and 12 GB of GDDR5X memory running at 11 Gbps

Here is a picture of the beautifully crafted NVIDIA GTX 1080, launched by the end of June 2016:

[Image: GeForce GTX 1080]

And it's my current graphics card, from when I decided to focus on Machine Learning and Data Science at the end of 2016 😉 – I am getting ready for you, baby! (currently learning Python)

Similarly, we have the Quadro family, focused on professional graphics workstations. Its flagship is the Quadro P6000, with 3,840 CUDA cores, 12 teraflops and 24 GB of GDDR5X memory.

And this just got better and better…

I could not help being reminded of this scene from Iron Sky.

Announced this past 10th of October 2017, we have the NVIDIA DRIVE PX Pegasus, the AI supercomputer for fully autonomous driving. The DRIVE PX platform scales from a single passively cooled, 10-watt mobile processor up to Pegasus' combination of four high-performance AI processors, which together are able to deliver 320 trillion operations per second (TOPS).

Pegasus! – I personally love the name (I think Mr. Jensen Huang must like the "Knights of the Zodiac" very much, as a good geek should ^.^)

I believe two of these AI processors are the newest Xavier system-on-a-chip processors, each coupled with an embedded GPU based on the NVIDIA Volta architecture. The other two seem to be next-generation discrete GPUs with hardware explicitly created to accelerate Deep Learning and computer vision algorithms. All in the size of a license plate.. not bad!

Here is a pic of the enormous “Pegasus” powerhorse:

[Image: NVIDIA DRIVE PX Pegasus board]

Cute, right?

This is huge – again, yeah. Think of it as basically putting 100 high-end servers into the size of a license plate.. servers built on current hardware, that is..

And this is powered by…

Volta!

Did I say Volta?

This is NVIDIA's GPU architecture meant to bring industrialization to AI, with a wide range of their products supporting the platform. NVIDIA Volta targets healthcare, finance, big data & gaming..

This hardware architecture includes 640 Tensor Cores, which deliver over 100 teraflops (TFLOPS) of deep learning performance, 5x the previous generation of NVIDIA's architecture (Pascal).
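As a hedged sketch of how those Tensor Cores get used in practice: they accelerate FP16 matrix multiplies, which recent TensorFlow releases expose through a "mixed precision" policy (the exact API is my assumption here and differs between frameworks and versions):

```python
# Enable mixed precision so large matrix multiplies run in FP16 (the format
# Tensor Cores accelerate) while results are accumulated in FP32.
import tensorflow as tf

tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4096, activation="relu", input_shape=(4096,)),
    tf.keras.layers.Dense(4096),
])
print(model.layers[0].compute_dtype)  # float16 -> eligible for Tensor Cores on Volta and newer GPUs
```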

DGX systems – AI supercomputers "à la carte", based on the just-mentioned Volta architecture: the workstation-style system with 4x Tesla V100 (the DGX Station), or the rack-based supercomputer DGX-1 with up to 8 Tesla V100, with one Intel Xeon for every 4 V100s. Oh, and all the other hardware is boosted to support this massive digital brainpower..

Here is a comparative picture to put things in the proper perspective…

[Image: NVIDIA DGX-1 comparison]

Here, in the hands of Jensen Huang, NVIDIA's co-founder and CEO, is a Volta V100, if you were wondering:

[Photo: Jensen Huang holding a Tesla V100]

Smaller than the 100 servers' worth of hardware it can beat, right?

The V100 family, along with the Volta architecture, was presented just recently this year at Computex, at the end of May.

Oh, and the market responded extremely well…

[Image: NVIDIA market chart]

They are also empowering IoT solutions for embedded systems, targeting small devices like drones, robots, etc.. that perform video analytics and autonomous AI, which is starting to become a trend in consumer products..

This product family is called NVIDIA Jetson, with the TX2 as its flagship: 256 CUDA cores, 8 GB of 128-bit LPDDR4 memory and two CPU clusters (HMP dual-core Denver 2 + quad-core ARM Cortex-A57).

As you can see, the race is on and continues to accelerate, and who knows where it will take us..

Hope you enjoyed this post, if you liked it, please subscribe 🙂

 

So, what do you think?

Please respond directly on my blog, so I do not have to work on gathering the feedback from different sources..

Sources:

https://www.nvidia.com/en-us/self-driving-cars/drive-px/

http://www.marketwired.com/press-release/nvidia-announces-worlds-first-ai-computer-to-make-robotaxis-a-reality-nasdaq-nvda-2236493.htm

https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/#source=pr

https://www.nvidia.com/en-us/deep-learning-ai/solutions/

https://www.nvidia.com/en-us/geforce/products/10series/titan-xp/

http://www.nvidia.com/object/embedded-systems.html

http://www.nvidia.com/object/embedded-systems-dev-kits-modules.html

http://www.nvidia.com/object/quadro-graphics-with-pascal.html

https://blogs.nvidia.com/blog/2016/10/24/intelligent-industrial-revolution/

https://blogs.nvidia.com/blog/2016/01/12/accelerating-ai-artificial-intelligence-gpus/

https://www.forbes.com/sites/kevinmurnane/2017/04/11/nvidia-addresses-googles-comparison-of-machine-learning-chips/#7c12d64d56fa

http://www.nvidia.com/object/tesla-servers.html

https://www.nvidia.com/en-us/data-center/tesla-v100/

https://www.cnbc.com/2017/10/13/buy-nvidia-because-a-i-is-killer-app-for-its-chips-analyst.html

https://www.theverge.com/2017/10/10/16449416/nvidia-pegasus-self-driving-car-ai-robotaxi

https://finance.yahoo.com/news/microsoft-built-technology-thats-better-130000704.html

https://www.cnbc.com/quotes/?symbol=NVDA

 

Machine Learning / Data Science / AI / Big Data… There I go!!

Updated 29/11/2017: I am adding AI programming to ramp up my Python skills, with some focus on a gamification site, codingame.com. I have updated the article to reflect this.

Call it what you want.. it is a very fuzzy topic and there are many discussions about the names and concepts 😉

For some time, after the "death" of Silverlight, I had an empty space… what I was missing was the DRIVE: something exciting that gets me engaged, that pushes and motivates me to go further… it's when you are in a hackathon and you have that feeling of…

This is it!!

And even though .NET Core is an exciting thing with its .NET Standard compliance, and Azure is pretty exciting and improving day by day, they still did not bring that "shiny" Silverlight factor that pushed me to play and explore with a technology and make it my playground… to devour design and interaction books as well as physics programming just to optimize resources and do magic in the UI… what times!!

So, I had two candidates: ML/DS (Machine Learning / Data Science) and AR/VR/MR… and since the second is still not mature enough (and it was impossible to get a HoloLens, too), I decided earlier this year to go for Machine Learning 🙂 – though you have probably figured that out already after reading the title..

I have set up a path through this vast topic that is Data Science, Machine Learning and AI. And, along this path, I plan to learn the best tools for the task at hand..

That said, I have already worked for 2+ years in ETL (Extract, Transform, Load), preparing data at a big publishing house, as well as in BI & reporting… more knowledge I can leverage from my experience..

But what is Data Science exactly? (as well as those other buzzwords)

As far as my understanding goes, these are their meanings/areas:

  • Data Science – The "everything goes in" discipline: collecting the data, organizing it and preparing it so patterns can be searched for, enabling advanced "tasks" on it like prediction, classification, etc.. Usually these tasks are the work of a Machine Learning model that does the magic. This profile usually has a decent background in data management and in defining data flows that integrate the data into a repository where the automated analysis can be done. It also requires math and statistics skills.
  • Machine Learning – The science of creating (or adapting/tuning) algorithms that learn on their own from data (read: they can be trained to perform better) – see the tiny sketch after this list. Usually a mixed profile of mathematician & coder fits this position best. You could say that ML is a subset of Data Science.
  • Deep Learning – To most people this is a subset of Machine Learning; it is in fact an ML technique (neural networks) that has had a lot of success on certain problems and is becoming a discipline of its own.
  • AI – A subfield of Computer Science about programming computers to solve human tasks: planning, moving, recognizing objects, etc… basically any task. This includes ML, since making a prediction from a set of data is, basically, a task; that makes ML a subset of AI. ML's goal is to make computers handle the task of learning from data by themselves, so they can make predictions.
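To make the Machine Learning bullet a bit more concrete, here is a tiny, hedged sketch of "learning from data": a model that is never told the classification rules, only shown labeled examples (assuming scikit-learn is installed; the bundled iris dataset is just a convenient stand-in).

```python
# An algorithm that "learns on its own from data" rather than being
# explicitly programmed with rules.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier()
model.fit(X_train, y_train)         # "training" = learning from labeled examples
print(model.score(X_test, y_test))  # accuracy of its predictions on unseen data
```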

And even though I believe this is a clear description, people are still discussing these definitions… There are articles that cover this topic in much more detail, like this one. If you want to understand how wide the possibilities for a "data scientist" are, read this.

Some people have several different but similar opinions, and if you have time, you can read some of them. But…

I want to feel the power of DS/ML at my fingertips, to know from top to bottom how to get things done, understanding every single step, and to be able to design, code and tune complex models that provide accurate results.. and to be able to explain those models through proper visualizations that give clear insight into the decisions taken by the model.. And for this,

I have a plan…

Here is my path forward for DS-ML-DL…

Step A: become a Data Science / ML "beginner"

Goal: to become knowledgeable about what is "out there": what people are using, what the main technologies are, and to get a feeling for them. Also, I like – no, love – UI, and believe that proper presentation greatly helps understanding, so I want to invest a good deal in data presentation skills.

  1. Andrew Ng's Machine Learning – done! – a great base, but everything was done with MATLAB… and without excessive explanation, as the exercises were pre-prepared.
  2. Udemy introduction to Data Science – done
  3. edX program from Microsoft for Data Science – in progress (4 out of 11 courses)
  4. Tableau A to Z (done)

Step B: become a proficient, or at least intermediate, ML developer and DS practitioner:

Goal: to become competent in programming with a hands-on, practical approach, both in R and in Python, even though I believe I will dig deeper into Python as there is a lot more material there.

  1. Datacamp.com: practicing with some courses in Python, 2 modules completed.
  2. codingame.com: practicing to polish my AI agent coding skills (in Python), currently working on the "intermediate" challenges.
  3. Python A-Z (Udemy, Kirill Eremenko) (done!)
  4. R A-Z (Udemy, Kirill Eremenko)
  5. Machine Learning A-Z, hands-on Python & R (Udemy)
  6. Taming Big Data with Apache Spark & Python (Udemy)

Step C: become an intermediate-to-advanced ML developer and get some experience:

Goal: do I need to explain? 😉

  1. Ensemble ML
  2. Start digging into Kaggle, with examples and tutorials, to get up to speed and compete in at least one Data Science contest. Ref: https://www.kaggle.com/
    Kaggle is like a "professional" ML racing league, so I want to have some basic skills and "driving" experience before joining a competition.
  3. I want this experience to consolidate my learnings, bringing them all together through hands-on work towards a goal.
  4. Tableau Expert: Top Visualization Techniques (to get some better knowledge of Tableau)

 

Step D: Get DEEP.

Goal: to get into the deepest and most complex topic in today's Machine Learning panorama. Deep Learning, combined with the new computational advances, seems to be key to implementing new kinds of predictive systems, and more – it is being used to develop AI systems able to devise strategies that beat the best humans at a task, and to be as creative as humans can be, but without our limitations: limited CPU power, limited ability to learn, and procrastination.. I have set up the following courses:

  1. Deep Learning A-Z
  2. Artificial Intelligence A-Z
  3. Join some Kaggle challenges regarding DL and/or AI development.
  4. Deep Learning: GANs and Variational Autoencoders
  5. Bayesian Machine Learning in Python
  6. Cluster Analysis and Unsupervised ML

Obviously this is a vast topic, and things here may evolve or change…

Regarding Kaggle, I believe it sits in the right spots of the plan. I consider it a way to consolidate the learned skills while also gaining some valuable experience; see this Quora post:
https://www.quora.com/Can-I-learn-Machine-Learning-completely-with-Kaggle
Also, I love hackathons and coding competitions… participating in these events always brings out the best in me and pushes me to develop even further than I expected, which is the biggest win – that said, winning or finishing near the top does not feel bad at all 😉

And what about Microsoft tech?

Well, I do plan to keep up to date on all things Microsoft; on top of the plan there is the Microsoft Data Science Orientation, and I have already been playing with Azure Machine Learning Studio, even participating in some competitions while I was doing the Andrew Ng course… I'd like to get hands-on and create some content.. I am thinking of some articles on the fundamental usage of Azure ML, showing the full AML workflow: create a data integration "data science" workflow, create a model and tune it, then publish it as a web service and consume it from .NET, for example (something like the sketch below, but from C#)…
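As a teaser for that last step, here is a hedged sketch of consuming a published Azure ML web service over REST. I show it in Python for brevity (the same call can be made from .NET with HttpClient); the endpoint URL, API key, column names and exact JSON shape below are placeholders – the real ones come from the service's own API help page in AML Studio:

```python
# Call a (hypothetical) Azure ML Studio scoring endpoint with one row of input.
import json
import urllib.request

SCORING_URL = "https://REGION.services.azureml.net/.../execute?api-version=2.0"  # placeholder
API_KEY = "YOUR_API_KEY"  # placeholder

payload = {
    "Inputs": {"input1": {"ColumnNames": ["feature1", "feature2"], "Values": [[1.0, 2.0]]}},
    "GlobalParameters": {},
}

request = urllib.request.Request(
    SCORING_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "Authorization": "Bearer " + API_KEY},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))  # the model's prediction for the input row
```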
So, do you think such an article (or a series of articles showing how to get this done) would be fun/useful?
And… what do you think of the plan? Let me know any suggestions you might have to improve it; I would really appreciate that a lot – I am just beginning 🙂
Update: I forgot to mention that I am spicing up the plan with a gem of a site I found thanks to the Microsoft Data Science course I am currently taking: http://www.datacamp.com – so some of their trainings will fit in here and there. Also, I might consider one of the specializations from Udacity later on; I have heard from somebody taking the courses that some of the nanodegrees "have it all"… so that could be an option too… 😉