{"id":155566,"date":"2019-09-16T18:56:51","date_gmt":"2019-09-16T22:56:51","guid":{"rendered":"https:\/\/www.countingpips.com\/?p=155566"},"modified":"2019-09-16T18:56:51","modified_gmt":"2019-09-16T22:56:51","slug":"turbocharging-python-with-command-line-tools","status":"publish","type":"post","link":"https:\/\/www.investmacro.com\/forex\/2019\/09\/turbocharging-python-with-command-line-tools\/","title":{"rendered":"Turbocharging Python with Command Line Tools"},"content":{"rendered":"<div id=\"inves-994117762\" class=\"inves-below-title-posts inves-entity-placement\"><div id =\"posts_date_custom\"><div align=\"left\">September 16, 2019<\/div><hr style=\"border: none; border-bottom: 3px solid black;\">\r\n<\/div><\/div><p><strong>By Noah Gift for Kite.com<\/strong><\/p>\n<div class=\"homepage__section\">\n<div class=\"homepage__section__content blog__content\">\n<div class=\"content-block\">\n<h3>Table of Contents<\/h3>\n<ul>\n<li>Introduction<\/li>\n<li>Using The Numba JIT (Just in time Compiler)<\/li>\n<li>Using the GPU with CUDA Python<\/li>\n<li>Running True Multi-Core Multithreaded Python using Numba<\/li>\n<li>KMeans Clustering<\/li>\n<li>Summary<\/li>\n<\/ul>\n<h2><span id=\"introduction\" class=\"blog_contents_anchors\"><\/span>Introduction<\/h2>\n<p>It\u2019s as good a time to be writing code as ever \u2013 these days, a little bit of code goes a long way. Just a single function is capable of performing incredible things. Thanks to GPUs, Machine Learning, the Cloud, and Python, it\u2019s easy to create \u201cturbocharged\u201d command-line tools. Think of it as upgrading your code from using a basic internal combustion engine to a nuclear reactor. The basic recipe for the upgrade? One function, a sprinkle of powerful logic, and, finally, a decorator to route it to the command-line.<\/p>\n<p>Writing and maintaining traditional GUI applications \u2013 web or desktop \u2013 is a Sisyphean task at best. 
It all starts with the best of intentions, but can quickly turn into a soul-crushing, time-consuming ordeal where you end up asking yourself why you thought becoming a programmer was a good idea in the first place. Why did you run that web framework setup utility that essentially automated a 1970s technology \u2013 the relational database \u2013 into a series of Python files? The old Ford Pinto with the exploding rear gas tank has newer technology than your web framework. There has got to be a better way to make a living.<\/p>\n<p>The answer is simple: stop writing web applications and start writing nuclear-powered command-line tools instead. The turbocharged command-line tools that I share below are focused on fast results with minimal lines of code. They can do things like learn from data (machine learning), make your code run 2,000 times faster, and best of all, generate colored terminal output.<\/p>\n<p>Here are the raw ingredients that will be used to make several solutions:<\/p>\n<ul>\n<li><a href=\"https:\/\/click.palletsprojects.com\/en\/7.x\/\" target=\"_blank\" rel=\"noopener noreferrer\">Click Framework<\/a><\/li>\n<li><a href=\"https:\/\/developer.nvidia.com\/how-to-cuda-python\" target=\"_blank\" rel=\"noopener noreferrer\">Python CUDA Framework<\/a><\/li>\n<li><a href=\"http:\/\/numba.pydata.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Numba Framework<\/a><\/li>\n<li><a href=\"http:\/\/scikit-learn.org\/dev\/tutorial\/machine_learning_map\/index.html\" target=\"_blank\" rel=\"noopener noreferrer\">Scikit-learn Machine Learning Framework<\/a><\/li>\n<\/ul>\n<p>You can follow along with source code, examples, and resources in\u00a0<a href=\"https:\/\/github.com\/kiteco\/kite-python-blog-post-code\/tree\/master\/Turbocharging%20Python%20with%20Command%20Line%20Tools\" target=\"_blank\" rel=\"noopener noreferrer\">Kite\u2019s github repository.<\/a><\/p>\n<h2><span id=\"numba-jit\" class=\"blog_contents_anchors\"><\/span>Using The Numba JIT (Just in time Compiler)<\/h2>\n<p>Python has a reputation for slow performance because it\u2019s fundamentally a scripting language. One way to get around this problem is to use the Numba JIT. 
Here\u2019s what that code looks like:<\/p>\n<p>First, use a timing decorator to get a grasp on the runtime of your functions:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">timing<\/span><span class=\"hljs-params\">(f)<\/span>:<\/span>\r\n<span class=\"hljs-meta\">    @wraps(f)<\/span>\r\n    <span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">wrap<\/span><span class=\"hljs-params\">(*args, **kwargs)<\/span>:<\/span>\r\n        ts = time()\r\n        result = f(*args, **kwargs)\r\n        te = time()\r\n        print(<span class=\"hljs-string\">f'func: <span class=\"hljs-subst\">{f.__name__}<\/span>, args: [<span class=\"hljs-subst\">{args}<\/span>, <span class=\"hljs-subst\">{kwargs}<\/span>] took: <span class=\"hljs-subst\">{te-ts}<\/span> sec'<\/span>)\r\n        <span class=\"hljs-keyword\">return<\/span> result\r\n    <span class=\"hljs-keyword\">return<\/span> wrap<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>Next, add a numba.jit decorator with the \u201cnopython\u201d keyword argument set to True. 
This ensures the code is compiled by the JIT instead of falling back to regular Python.<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-meta\">@timing<\/span>\r\n<span class=\"hljs-meta\">@numba.jit(nopython=True)<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">expmean_jit<\/span><span class=\"hljs-params\">(rea)<\/span>:<\/span>\r\n    <span class=\"hljs-string\">\"\"\"Perform multiple mean calculations\"\"\"<\/span>\r\n\r\n    val = rea.mean() ** <span class=\"hljs-number\">2<\/span>\r\n    <span class=\"hljs-keyword\">return<\/span> val<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>When you run it, you can see both the \u201cjit\u201d and the regular version being run via the command-line tool:<\/p>\n<p><code>$ python nuclearcli.py jit-test<\/code><\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\">Running NO JIT\r\nfunc:<span class=\"hljs-string\">'expmean'<\/span> args:[(array([[<span class=\"hljs-number\">1.0000e+00<\/span>, <span class=\"hljs-number\">4.2080e+05<\/span>, <span class=\"hljs-number\">4.2350e+05<\/span>, ..., <span class=\"hljs-number\">1.0543e+06<\/span>, <span class=\"hljs-number\">1.0485e+06<\/span>,\r\n        <span class=\"hljs-number\">1.0444e+06<\/span>],\r\n       [<span class=\"hljs-number\">2.0000e+00<\/span>, <span class=\"hljs-number\">5.4240e+05<\/span>, <span class=\"hljs-number\">5.4670e+05<\/span>, ..., <span class=\"hljs-number\">1.5158e+06<\/span>, <span class=\"hljs-number\">1.5199e+06<\/span>,\r\n        <span class=\"hljs-number\">1.5253e+06<\/span>],\r\n       [<span class=\"hljs-number\">3.0000e+00<\/span>, <span class=\"hljs-number\">7.0900e+04<\/span>, <span class=\"hljs-number\">7.1200e+04<\/span>, ..., <span class=\"hljs-number\">1.1380e+05<\/span>, <span class=\"hljs-number\">1.1350e+05<\/span>,\r\n        <span 
class=\"hljs-number\">1.1330e+05<\/span>],\r\n       ...,\r\n       [<span class=\"hljs-number\">1.5277e+04<\/span>, <span class=\"hljs-number\">9.8900e+04<\/span>, <span class=\"hljs-number\">9.8100e+04<\/span>, ..., <span class=\"hljs-number\">2.1980e+05<\/span>, <span class=\"hljs-number\">2.2000e+05<\/span>,\r\n        <span class=\"hljs-number\">2.2040e+05<\/span>],\r\n       [<span class=\"hljs-number\">1.5280e+04<\/span>, <span class=\"hljs-number\">8.6700e+04<\/span>, <span class=\"hljs-number\">8.7500e+04<\/span>, ..., <span class=\"hljs-number\">1.9070e+05<\/span>, <span class=\"hljs-number\">1.9230e+05<\/span>,\r\n        <span class=\"hljs-number\">1.9360e+05<\/span>],\r\n       [<span class=\"hljs-number\">1.5281e+04<\/span>, <span class=\"hljs-number\">2.5350e+05<\/span>, <span class=\"hljs-number\">2.5400e+05<\/span>, ..., <span class=\"hljs-number\">7.8360e+05<\/span>, <span class=\"hljs-number\">7.7950e+05<\/span>,\r\n        <span class=\"hljs-number\">7.7420e+05<\/span>]], dtype=float32),), {}] took: <span class=\"hljs-number\">0.0007<\/span> sec<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p><code>$ python nuclearcli.py jit-test --jit<\/code><\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\">Running <span class=\"hljs-keyword\">with<\/span> JIT\r\nfunc:<span class=\"hljs-string\">'expmean_jit'<\/span> args:[(array([[<span class=\"hljs-number\">1.0000e+00<\/span>, <span class=\"hljs-number\">4.2080e+05<\/span>, <span class=\"hljs-number\">4.2350e+05<\/span>, ..., <span class=\"hljs-number\">1.0543e+06<\/span>, <span class=\"hljs-number\">1.0485e+06<\/span>,\r\n        <span class=\"hljs-number\">1.0444e+06<\/span>],\r\n       [<span class=\"hljs-number\">2.0000e+00<\/span>, <span class=\"hljs-number\">5.4240e+05<\/span>, <span class=\"hljs-number\">5.4670e+05<\/span>, ..., <span class=\"hljs-number\">1.5158e+06<\/span>, <span class=\"hljs-number\">1.5199e+06<\/span>,\r\n        <span
class=\"hljs-number\">1.5253e+06<\/span>],\r\n       [<span class=\"hljs-number\">3.0000e+00<\/span>, <span class=\"hljs-number\">7.0900e+04<\/span>, <span class=\"hljs-number\">7.1200e+04<\/span>, ..., <span class=\"hljs-number\">1.1380e+05<\/span>, <span class=\"hljs-number\">1.1350e+05<\/span>,\r\n        <span class=\"hljs-number\">1.1330e+05<\/span>],\r\n       ...,\r\n       [<span class=\"hljs-number\">1.5277e+04<\/span>, <span class=\"hljs-number\">9.8900e+04<\/span>, <span class=\"hljs-number\">9.8100e+04<\/span>, ..., <span class=\"hljs-number\">2.1980e+05<\/span>, <span class=\"hljs-number\">2.2000e+05<\/span>,\r\n        <span class=\"hljs-number\">2.2040e+05<\/span>],\r\n       [<span class=\"hljs-number\">1.5280e+04<\/span>, <span class=\"hljs-number\">8.6700e+04<\/span>, <span class=\"hljs-number\">8.7500e+04<\/span>, ..., <span class=\"hljs-number\">1.9070e+05<\/span>, <span class=\"hljs-number\">1.9230e+05<\/span>,\r\n        <span class=\"hljs-number\">1.9360e+05<\/span>],\r\n       [<span class=\"hljs-number\">1.5281e+04<\/span>, <span class=\"hljs-number\">2.5350e+05<\/span>, <span class=\"hljs-number\">2.5400e+05<\/span>, ..., <span class=\"hljs-number\">7.8360e+05<\/span>, <span class=\"hljs-number\">7.7950e+05<\/span>,\r\n        <span class=\"hljs-number\">7.7420e+05<\/span>]], dtype=float32),), {}] took: <span class=\"hljs-number\">0.2180<\/span> sec<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>How does that work? 
Just a few lines of code allow for this simple toggle:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-meta\">@cli.command()<\/span>\r\n<span class=\"hljs-meta\">@click.option('--jit\/--no-jit', default=False)<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">jit_test<\/span><span class=\"hljs-params\">(jit)<\/span>:<\/span>\r\n    rea = real_estate_array()\r\n    <span class=\"hljs-keyword\">if<\/span> jit:\r\n        click.echo(click.style(<span class=\"hljs-string\">'Running with JIT'<\/span>, fg=<span class=\"hljs-string\">'green'<\/span>))\r\n        expmean_jit(rea)\r\n    <span class=\"hljs-keyword\">else<\/span>:\r\n        click.echo(click.style(<span class=\"hljs-string\">'Running NO JIT'<\/span>, fg=<span class=\"hljs-string\">'red'<\/span>))\r\n        expmean(rea)<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>In some cases a JIT version could make code run thousands of times faster, but benchmarking is key. Another item to point out is the line:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\">click.echo(click.style(<span class=\"hljs-string\">'Running with JIT'<\/span>, fg=<span class=\"hljs-string\">'green'<\/span>))<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>This call produces colored terminal output, which can be very helpful in creating sophisticated tools.<\/p>\n<\/div>\n<div class=\"content-block\">\n<h2><span id=\"gpu\" class=\"blog_contents_anchors\"><\/span>Using the GPU with CUDA Python<\/h2>\n<p>Another way to nuclear-power your code is to run it straight on a GPU. This example requires you to run it on a machine with a CUDA-enabled GPU. 
Here\u2019s what that code looks like:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-meta\">@cli.command()<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">cuda_operation<\/span><span class=\"hljs-params\">()<\/span>:<\/span>\r\n    <span class=\"hljs-string\">\"\"\"Performs Vectorized Operations on GPU\"\"\"<\/span>\r\n\r\n    x = real_estate_array()\r\n    y = real_estate_array()\r\n\r\n    print(<span class=\"hljs-string\">'Moving calculations to GPU memory'<\/span>)\r\n    x_device = cuda.to_device(x)\r\n    y_device = cuda.to_device(y)\r\n    out_device = cuda.device_array(\r\n        shape=(x_device.shape[<span class=\"hljs-number\">0<\/span>], x_device.shape[<span class=\"hljs-number\">1<\/span>]), dtype=np.float32)\r\n    print(x_device)\r\n    print(x_device.shape)\r\n    print(x_device.dtype)\r\n\r\n    print(<span class=\"hljs-string\">'Calculating on GPU'<\/span>)\r\n    add_ufunc(x_device, y_device, out=out_device)\r\n\r\n    out_host = out_device.copy_to_host()\r\n    print(<span class=\"hljs-string\">f'Calculations from GPU <span class=\"hljs-subst\">{out_host}<\/span>'<\/span>)<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>It\u2019s useful to point out that the NumPy array is first moved to the GPU, where a vectorized function does the work. After that work is completed, the data is moved back from the GPU. Depending on what it\u2019s running, using a GPU can be a monumental improvement to the code. 
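Note that `add_ufunc` itself isn't defined in the excerpt above; with Numba it would typically be created with the `@vectorize` decorator targeting CUDA, and its effect is simply an elementwise add. Here is a minimal CPU-only sketch of that behavior using plain NumPy (the `out` signature mirrors how the command above calls it; the stand-in is an assumption for illustration):

```python
import numpy as np

# add_ufunc performs an elementwise add. With Numba it might be declared as:
#   @vectorize(['float32(float32, float32)'], target='cuda')
#   def add_ufunc(x, y): return x + y
# This CPU-only stand-in mimics that behavior, including the 'out' argument.
def add_ufunc(x, y, out=None):
    return np.add(x, y, out=out)

x = np.arange(6, dtype=np.float32).reshape(2, 3)
y = x * 10
out = np.empty_like(x)
add_ufunc(x, y, out=out)
print(out)  # each element is the sum of the matching elements of x and y
```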
The output from the command-line tool is shown below:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\">$ python nuclearcli.py cuda-operation\r\nMoving calculations to GPU memory\r\n\r\n(<span class=\"hljs-number\">10015<\/span>, <span class=\"hljs-number\">259<\/span>)\r\nfloat32\r\nCalculating on GPU\r\nCalculations <span class=\"hljs-keyword\">from<\/span> GPU [[<span class=\"hljs-number\">2.0000e+00<\/span> <span class=\"hljs-number\">8.4160e+05<\/span> <span class=\"hljs-number\">8.4700e+05<\/span> ... <span class=\"hljs-number\">2.1086e+06<\/span> <span class=\"hljs-number\">2.0970e+06<\/span> <span class=\"hljs-number\">2.0888e+06<\/span>]\r\n [<span class=\"hljs-number\">4.0000e+00<\/span> <span class=\"hljs-number\">1.0848e+06<\/span> <span class=\"hljs-number\">1.0934e+06<\/span> ... <span class=\"hljs-number\">3.0316e+06<\/span> <span class=\"hljs-number\">3.0398e+06<\/span> <span class=\"hljs-number\">3.0506e+06<\/span>]\r\n [<span class=\"hljs-number\">6.0000e+00<\/span> <span class=\"hljs-number\">1.4180e+05<\/span> <span class=\"hljs-number\">1.4240e+05<\/span> ... <span class=\"hljs-number\">2.2760e+05<\/span> <span class=\"hljs-number\">2.2700e+05<\/span> <span class=\"hljs-number\">2.2660e+05<\/span>]\r\n ...\r\n [<span class=\"hljs-number\">3.0554e+04<\/span> <span class=\"hljs-number\">1.9780e+05<\/span> <span class=\"hljs-number\">1.9620e+05<\/span> ... <span class=\"hljs-number\">4.3960e+05<\/span> <span class=\"hljs-number\">4.4000e+05<\/span> <span class=\"hljs-number\">4.4080e+05<\/span>]\r\n [<span class=\"hljs-number\">3.0560e+04<\/span> <span class=\"hljs-number\">1.7340e+05<\/span> <span class=\"hljs-number\">1.7500e+05<\/span> ...
<span class=\"hljs-number\">3.8140e+05<\/span> <span class=\"hljs-number\">3.8460e+05<\/span> <span class=\"hljs-number\">3.8720e+05<\/span>]\r\n [<span class=\"hljs-number\">3.0562e+04<\/span> <span class=\"hljs-number\">5.0700e+05<\/span> <span class=\"hljs-number\">5.0800e+05<\/span> ... <span class=\"hljs-number\">1.5672e+06<\/span> <span class=\"hljs-number\">1.5590e+06<\/span> <span class=\"hljs-number\">1.5484e+06<\/span>]]<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<h2><span id=\"running\" class=\"blog_contents_anchors\"><\/span>Running True Multi-Core Multithreaded Python using Numba<\/h2>\n<p>One common performance problem with Python is the lack of true multithreaded performance. This can also be fixed with Numba. Here\u2019s an example of some basic operations:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-meta\">@timing<\/span>\r\n<span class=\"hljs-meta\">@numba.jit(parallel=True)<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">add_sum_threaded<\/span><span class=\"hljs-params\">(rea)<\/span>:<\/span>\r\n    <span class=\"hljs-string\">\"\"\"Use all the cores\"\"\"<\/span>\r\n\r\n    x, _ = rea.shape\r\n    total = <span class=\"hljs-number\">0<\/span>\r\n    <span class=\"hljs-keyword\">for<\/span> _ <span class=\"hljs-keyword\">in<\/span> numba.prange(x):\r\n        total += rea.sum()\r\n        print(total)\r\n\r\n<span class=\"hljs-meta\">@timing<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">add_sum<\/span><span class=\"hljs-params\">(rea)<\/span>:<\/span>\r\n    <span class=\"hljs-string\">\"\"\"traditional for loop\"\"\"<\/span>\r\n\r\n    x, _ = rea.shape\r\n    total = <span class=\"hljs-number\">0<\/span>\r\n    <span class=\"hljs-keyword\">for<\/span> _ <span class=\"hljs-keyword\">in<\/span> range(x):\r\n        total += 
rea.sum()\r\n        print(total)\r\n\r\n<span class=\"hljs-meta\">@cli.command()<\/span>\r\n<span class=\"hljs-meta\">@click.option('--threads\/--no-threads', default=False)<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">thread_test<\/span><span class=\"hljs-params\">(threads)<\/span>:<\/span>\r\n    rea = real_estate_array()\r\n    <span class=\"hljs-keyword\">if<\/span> threads:\r\n        click.echo(click.style(<span class=\"hljs-string\">'Running with multicore threads'<\/span>, fg=<span class=\"hljs-string\">'green'<\/span>))\r\n        add_sum_threaded(rea)\r\n    <span class=\"hljs-keyword\">else<\/span>:\r\n        click.echo(click.style(<span class=\"hljs-string\">'Running NO THREADS'<\/span>, fg=<span class=\"hljs-string\">'red'<\/span>))\r\n        add_sum(rea)<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>Note that the key difference in the parallel version is that it uses\u00a0<code>@numba.jit(parallel=True)<\/code>\u00a0and numba.prange to spawn threads for the iteration. 
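Numba's prange does this chunking and thread scheduling at the compiler level. For comparison, the same divide-and-combine idea can be sketched with only the standard library, splitting a sum across worker processes (the chunk count of 4 and the function names here are illustrative, not from the article):

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np

def chunk_sum(chunk):
    """Sum one slice of the array; runs in a worker process."""
    return float(np.sum(chunk))

def parallel_sum(arr, workers=4):
    """Split arr into one chunk per worker and add up the partial sums."""
    chunks = np.array_split(arr, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(chunk_sum, chunks))

if __name__ == '__main__':
    rea = np.arange(1_000_000, dtype=np.float64)
    print(parallel_sum(rea))
```

Unlike Numba's approach, this pays process-startup and serialization overhead, so it only wins on large arrays.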
Looking at the picture below, all of the CPUs are maxed out on the machine, but when almost the exact same code is run without the parallelization, it only uses a single core.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-246\" src=\"https:\/\/kite.com\/wp-content\/uploads\/2019\/03\/graph.c2ee5067-1024x507.png\" alt=\"CPU utilization: all cores maxed out with parallelization, a single core without\" width=\"679\" height=\"336\" \/><\/p>\n<p><code>$ python nuclearcli.py thread-test<\/code><\/p>\n<p><code>$ python nuclearcli.py thread-test --threads<\/code><\/p>\n<h2><span id=\"kmeans\" class=\"blog_contents_anchors\"><\/span>KMeans Clustering<\/h2>\n<p>One more powerful thing that can be accomplished in a command-line tool is machine learning. In the example below, a KMeans clustering function is created with just a few lines of code. 
This clusters a pandas DataFrame into a default of 3 clusters.<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">kmeans_cluster_housing<\/span><span class=\"hljs-params\">(clusters=<span class=\"hljs-number\">3<\/span>)<\/span>:<\/span>\r\n    <span class=\"hljs-string\">\"\"\"Kmeans cluster a dataframe\"\"\"<\/span>\r\n    url = <span class=\"hljs-string\">'https:\/\/raw.githubusercontent.com\/noahgift\/socialpowernba\/master\/data\/nba_2017_att_val_elo_win_housing.csv'<\/span>\r\n    val_housing_win_df = pd.read_csv(url)\r\n    numerical_df = (\r\n        val_housing_win_df.loc[:, [<span class=\"hljs-string\">'TOTAL_ATTENDANCE_MILLIONS'<\/span>, <span class=\"hljs-string\">'ELO'<\/span>,\r\n        <span class=\"hljs-string\">'VALUE_MILLIONS'<\/span>, <span class=\"hljs-string\">'MEDIAN_HOME_PRICE_COUNTY_MILLIONS'<\/span>]]\r\n    )\r\n    <span class=\"hljs-comment\"># scale data<\/span>\r\n    scaler = MinMaxScaler()\r\n    scaler.fit(numerical_df)\r\n    <span class=\"hljs-comment\"># cluster data<\/span>\r\n    k_means = KMeans(n_clusters=clusters)\r\n    kmeans = k_means.fit(scaler.transform(numerical_df))\r\n    val_housing_win_df[<span class=\"hljs-string\">'cluster'<\/span>] = kmeans.labels_\r\n    <span class=\"hljs-keyword\">return<\/span> val_housing_win_df<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>The cluster number can be changed by passing in another number (as shown below) using click:<\/p>\n<\/div>\n<div class=\"code-block python\">\n<pre><code class=\"Python hljs livecodeserver\"><span class=\"hljs-meta\">@cli.command()<\/span>\r\n<span class=\"hljs-meta\">@click.option('--num', default=3, help='number of clusters')<\/span>\r\n<span class=\"hljs-function\"><span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title\">cluster<\/span><span
class=\"hljs-params\">(num)<\/span>:<\/span>\r\n    df = kmeans_cluster_housing(clusters=num)\r\n    click.echo(<span class=\"hljs-string\">'Clustered DataFrame'<\/span>)\r\n    click.echo(df.head())<\/code><\/pre>\n<\/div>\n<div class=\"content-block\">\n<p>Finally, the output of the Pandas DataFrame with the cluster assignment is shown below. Note that it now has the cluster assignment as a column.<\/p>\n<p><code>$ python nuclearcli.py cluster<\/code><\/p>\n<div class=\"text__table full\">\n<table>\n<tbody>\n<tr>\n<td><strong>Clustered DataFrame<\/strong><\/td>\n<td>0<\/td>\n<td>1<\/td>\n<td>2<\/td>\n<td>3<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td><strong>TEAM<\/strong><\/td>\n<td>Chicago Bulls<\/td>\n<td>Dallas Mavericks<\/td>\n<td>Sacramento Kings<\/td>\n<td>Miami Heat<\/td>\n<td>Toronto Raptors<\/td>\n<\/tr>\n<tr>\n<td><strong>GMS<\/strong><\/td>\n<td>41<\/td>\n<td>41<\/td>\n<td>41<\/td>\n<td>41<\/td>\n<td>41<\/td>\n<\/tr>\n<tr>\n<td><strong>PCT_ATTENDANCE<\/strong><\/td>\n<td>104<\/td>\n<td>103<\/td>\n<td>101<\/td>\n<td>100<\/td>\n<td>100<\/td>\n<\/tr>\n<tr>\n<td><strong>WINNING_SEASON<\/strong><\/td>\n<td>1<\/td>\n<td>0<\/td>\n<td>0<\/td>\n<td>1<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td><strong>\u2026<\/strong><\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<\/tr>\n<tr>\n<td><strong>COUNTY<\/strong><\/td>\n<td>Cook<\/td>\n<td>Dallas<\/td>\n<td>Sacremento<\/td>\n<td>Miami-Dade<\/td>\n<td>York-County<\/td>\n<\/tr>\n<tr>\n<td><strong>MEDIAN_HOME_PRICE_COUNTY_MILLIONS<\/strong><\/td>\n<td>269900.0<\/td>\n<td>314990.0<\/td>\n<td>343950.0<\/td>\n<td>389000.0<\/td>\n<td>390000.0<\/td>\n<\/tr>\n<tr>\n<td><strong>COUNTY_POPULATION_MILLIONS<\/strong><\/td>\n<td>5.20<\/td>\n<td>2.57<\/td>\n<td>1.51<\/td>\n<td>2.71<\/td>\n<td>1.10<\/td>\n<\/tr>\n<tr>\n<td><strong>cluster<\/strong><\/td>\n<td>0<\/td>\n<td>0<\/td>\n<td>1<\/td>\n<td>0<\/td>\n<td>0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p><code>$ python nuclearcli.py cluster
--num 2<\/code><\/p>\n<div class=\"text__table full\">\n<table>\n<tbody>\n<tr>\n<td><strong>Clustered DataFrame<\/strong><\/td>\n<td>0<\/td>\n<td>1<\/td>\n<td>2<\/td>\n<td>3<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td><strong>TEAM<\/strong><\/td>\n<td>Chicago Bulls<\/td>\n<td>Dallas Mavericks<\/td>\n<td>Sacramento Kings<\/td>\n<td>Miami Heat<\/td>\n<td>Toronto Raptors<\/td>\n<\/tr>\n<tr>\n<td><strong>GMS<\/strong><\/td>\n<td>41<\/td>\n<td>41<\/td>\n<td>41<\/td>\n<td>41<\/td>\n<td>41<\/td>\n<\/tr>\n<tr>\n<td><strong>PCT_ATTENDANCE<\/strong><\/td>\n<td>104<\/td>\n<td>103<\/td>\n<td>101<\/td>\n<td>100<\/td>\n<td>100<\/td>\n<\/tr>\n<tr>\n<td><strong>WINNING_SEASON<\/strong><\/td>\n<td>1<\/td>\n<td>0<\/td>\n<td>0<\/td>\n<td>1<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td><strong>\u2026<\/strong><\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<td>\u2026<\/td>\n<\/tr>\n<tr>\n<td><strong>COUNTY<\/strong><\/td>\n<td>Cook<\/td>\n<td>Dallas<\/td>\n<td>Sacremento<\/td>\n<td>Miami-Dade<\/td>\n<td>York-County<\/td>\n<\/tr>\n<tr>\n<td><strong>MEDIAN_HOME_PRICE_COUNTY_MILLIONS<\/strong><\/td>\n<td>269900.0<\/td>\n<td>314990.0<\/td>\n<td>343950.0<\/td>\n<td>389000.0<\/td>\n<td>390000.0<\/td>\n<\/tr>\n<tr>\n<td><strong>COUNTY_POPULATION_MILLIONS<\/strong><\/td>\n<td>5.20<\/td>\n<td>2.57<\/td>\n<td>1.51<\/td>\n<td>2.71<\/td>\n<td>1.10<\/td>\n<\/tr>\n<tr>\n<td><strong>cluster<\/strong><\/td>\n<td>1<\/td>\n<td>1<\/td>\n<td>0<\/td>\n<td>1<\/td>\n<td>1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h2><span id=\"summary\" class=\"blog_contents_anchors\"><\/span>Summary<\/h2>\n<p>The goal of this article is to show how simple command-line tools can be a great alternative to heavy web frameworks. In under 200 lines of code, you\u2019re now able to create a command-line tool that involves GPU parallelization, JIT, core saturation, as well as Machine Learning. 
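Putting the pieces together, a tool like nuclearcli.py is just a Click group with commands hung off it. Here is a minimal, self-contained sketch of that skeleton (the command bodies are placeholders, not the article's full implementations):

```python
import click

@click.group()
def cli():
    """Nuclear powered command-line tool."""

@cli.command()
@click.option('--jit/--no-jit', default=False)
def jit_test(jit):
    """Toggle between a JIT-compiled and a plain function."""
    if jit:
        click.echo(click.style('Running with JIT', fg='green'))
    else:
        click.echo(click.style('Running NO JIT', fg='red'))

@cli.command()
@click.option('--num', default=3, help='number of clusters')
def cluster(num):
    """Cluster with a configurable number of clusters."""
    click.echo(f'Clustering with {num} clusters')

if __name__ == '__main__':
    cli()
```

Click converts the function name jit_test into the subcommand name jit-test, which is why the tool is invoked as `python nuclearcli.py jit-test --jit`.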
The examples I shared above are just the beginning of upgrading your developer productivity to nuclear power, and I hope you\u2019ll use these programming tools to help build the future.<\/p>\n<p>Many of the most powerful things happening in the software industry are based on functions: distributed computing, machine learning, cloud computing (functions as a service), and GPU-based programming are all great examples. The natural way of controlling these functions is a decorator-based command-line tool \u2013 not clunky 20th-century web frameworks. The Ford Pinto is now parked in a garage, and you\u2019re driving a shiny new \u201cturbocharged\u201d command-line interface that maps powerful yet simple functions to logic using the Click framework.<\/p>\n<p><i>Noah Gift is a lecturer and consultant at both the UC Davis Graduate School of Management MSBA program and the Graduate Data Science program, MSDS, at Northwestern. He teaches and designs graduate machine learning, AI, and data science courses, and consults on Machine Learning and Cloud Architecture for students and faculty.<\/i><\/p>\n<p><i>Noah\u2019s new book,\u00a0<a href=\"http:\/\/www.informit.com\/store\/pragmatic-ai-an-introduction-to-cloud-based-machine-9780134863863?utm_source=Referral&amp;utm_medium=Kite&amp;utm_campaign=pragai\" target=\"_blank\" rel=\"noopener noreferrer\">Pragmatic AI<\/a>, will help you solve real-world problems with contemporary machine learning, artificial intelligence, and cloud computing tools. Noah Gift demystifies all the concepts and tools you need to get results\u2014even if you don\u2019t have a strong background in math or data science. Save 30% with the code, \u201cKITE\u201d.<\/i><\/p>\n<p class=\"blog__content--footer\">This post is a part of Kite\u2019s new series on Python. 
You can check out the code from this and other posts on our\u00a0<a href=\"https:\/\/github.com\/kiteco\/kite-python-blog-post-code\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repository<\/a>.<\/p>\n<p><em><strong>About the Author:<\/strong><\/em><\/p>\n<p><a href=\"https:\/\/kite.com\/blog\/python\/python-command-line-tools\/\" target=\"_blank\" rel=\"noopener noreferrer\">This article <\/a>originally appeared on <a href=\"https:\/\/kite.com\" target=\"_blank\" rel=\"noopener noreferrer\">Kite.com<\/a>.<\/p>\n<p>(Reprinted with permission)<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>By Noah Gift for Kite.com Table of Contents Introduction Using The Numba JIT (Just in time Compiler) Using the GPU with CUDA Python Running True Multi-Core Multithreaded Python using Numba KMeans Clustering Summary Introduction It\u2019s as good a time to be writing code as ever \u2013 these days, a little bit of code goes a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-155566","post","type-post","status-publish","format-standard","hentry","no-post-thumbnail"],"_links":{"self":[{"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/posts\/155566","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/comments?post=155566"}],"version-history":[{"count":2,"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/posts\/155566\/revisions"}],"predecessor-version":[{"id":155568,"href":
"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/posts\/155566\/revisions\/155568"}],"wp:attachment":[{"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/media?parent=155566"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/categories?post=155566"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.investmacro.com\/forex\/wp-json\/wp\/v2\/tags?post=155566"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}