My Raspberry Bramble, Part 1

It seems inevitable that anyone who is interested in computer programming eventually becomes interested in the idea of parallel programming. After a little bit of searching you eventually discover the world of cluster computing, only to find that there is really nothing out there to help.

I think this is because cluster computing is still pretty expensive to play with, unless you are a government agency, corporate entity, or educational facility. This leaves me out in the cold…

Or does it? I have started to see more and more blogs and tutorials describing how to build a low-cost cluster out of Raspberry PI single board computers. I have to admit I have been a tremendous fan of the Raspberry PI since it first came out. They run Linux, promote Python as the programming language of choice, and best of all, they only cost $35. But I could never figure out what I would do with one. Then I found clusters. Oh sorry, they are called brambles when they are built with Raspberry PIs.

The Hardware

How cool would it be to have an affordable cluster that you can experiment with? I loved the idea, so I went out and purchased the components to build a small 3-node Raspberry PI bramble.

  • 3 x Raspberry PI 3 @ $35 for 1
  • 3 x 32GB microSD card @ $11 for 1
  • 3 x 1 ft ethernet patch cords @ $12 for 6
  • 3 x 1 ft USB to micro-USB cords for power @ $11 for 10
  • 1 x RavPower 4-port USB charger @ $22 for 1
  • 1 x Netgear 5-port ethernet switch @ $20 for 1
  • 1 x Stackable 4-layer dog-bone case @ $23 for 1

The total price was $226. Not bad for a 3-node bramble!
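The total is easy to double-check with a few lines of Python. This is only a sketch of the arithmetic: the parts list and its layout are hypothetical names chosen for illustration, and the cords are charged as one full pack each because they were sold in packs of 6 and 10.

```python
# Each entry: (description, quantity purchased, price per purchase in dollars).
# The cords come in multi-packs, so one pack covers all three nodes.
parts = [
    ("Raspberry PI 3", 3, 35),
    ("32GB microSD card", 3, 11),
    ("1 ft ethernet patch cords (pack of 6)", 1, 12),
    ("1 ft USB to micro-USB cords (pack of 10)", 1, 11),
    ("RavPower 4-port USB charger", 1, 22),
    ("Netgear 5-port ethernet switch", 1, 20),
    ("Stackable 4-layer dog-bone case", 1, 23),
]

# Multiply quantity by price for each line and add everything up.
total = sum(count * price for _, count, price in parts)
print(f"Total: ${total}")
```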

While I have built my bramble with 3 Raspberry PIs (I will refer to them as nodes from now on) it is not necessary to have more than one. It turns out you can still play with clustering with a single node.

The following sections are a review of the initial steps to configure each node of your bramble. In summary:

  • Flash the microSD card with the OS
  • Add an empty file named ssh on the newly flashed microSD card
  • Put the microSD card into the node
  • Boot the node
  • Login to the node
    • Setup a static ip-address (unique on each node)
    • Run raspi-config
      • Change the password (same on all nodes)
      • Set so ssh server starts on boot (same on all nodes)
      • Change the hostname (unique on each node)
      • Set the memory split to 16 (same on each node)
      • Exit and restart
  • Repeat for each node

Once you have completed each of the above steps you will probably think you now have 1 or more nodes for use in a bramble. But that would be wrong! In the next blog I will cover how to finish the configuration.

What follows is arguably the most confusing aspect of setting up any Raspberry PI, let alone a bramble of them.

I must point out right now that I am a Mac user and all of the steps after this are done using Mac OS, though Linux users should have no difficulty with them. As for Windows users, I apologize, but I do not have the hardware to figure out how to do all of this for you, though I know there are lots of blogs out there that have.

Setup the microSD cards

I have opted to use the raspbian-stretch-lite distribution that is available here. I went with the lite distribution because I plan on using my Mac to ssh into the bramble and will not need the graphical user interface part of the operating system.

To put the operating system on the microSD card, which is called flashing, I had to mount the card on my computer. Fortunately I have an older Mac that still has an SD card slot. I was able to flash each of the microSD cards using the excellent app Etcher that is available here. The microSD card is named boot after the flash finishes.

After flashing, the microSD card is left unmounted. To mount it, remove and reinsert it, and it will show up on the desktop. Once it was mounted I ran the following command in the Terminal: touch /Volumes/boot/ssh

The touch command is used to change the modified timestamp of a file to the current time. In this case there is no file named ssh and so it creates an empty one. Putting this file on the microSD card makes it so the Raspberry Pi it is inserted into starts the ssh server after it boots. But don’t boot it just yet. It is easier to do them all together so be sure to configure each of the microSD cards the same way.
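For anyone who would rather script this step, here is a minimal Python sketch of what touch does in this case. The function name enable_ssh is hypothetical, and the path /Volumes/boot assumes the card has mounted on a Mac as described above.

```python
from pathlib import Path


def enable_ssh(boot_volume):
    """Create an empty file named ssh on the mounted boot volume."""
    marker = Path(boot_volume) / "ssh"
    # Like the touch command: create the file if it is missing,
    # otherwise just update its modified timestamp.
    marker.touch(exist_ok=True)
    return marker


if __name__ == "__main__":
    boot = Path("/Volumes/boot")  # assumed mount point on macOS
    if boot.is_dir():
        print("Created", enable_ssh(boot))
    else:
        print("The boot volume is not mounted; reinsert the card first.")
```

Running it once per card, after each flash, does the same thing as typing the touch command by hand.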

First Boot

I did not want to mess with the whole “what is my ip-address?” thing, so I connect my first Raspberry PI to my TV with an HDMI cable. My Mac is plugged into one of the ethernet ports of the switch, as are all of the Raspberry PIs. Next I plug a micro USB cable into the power connector of each Raspberry PI.

Then I plug only the Raspberry PI that is connected to the TV into the USB charger. As the PI boots, I can watch the progress on the TV. Near the very end I can see the ssh server being started, and finally it will say something like ‘the ip-address is xxx.xxx.xxx.xxx’. I am looking for a line above the login prompt, once the boot has completed, that has 4 numbers separated by periods. This is the ip-address that has been assigned to the PI. Write it down.

This is unquestionably the most confusing part of setting up a cluster. Again, if you are having trouble I refer you to all of the other blogs that discuss how to figure out your ip-address. The point is to get what the current ip-address is so that you can ssh into the PI to finish the initial configuration.

In the Terminal I can now log in using the ip-address I wrote down earlier and then entering the password when prompted:

>> ssh pi@xxx.xxx.xxx.xxx
Password: raspberry

The user is pi by default and the default password is raspberry.

And I am logged in.

At this point I set the node to have a static ip-address. The trade-off is that the bramble is tied to my home network and I cannot easily use it anywhere else, but I can always reach each node from my Mac at a known address.

In the Terminal I type sudo nano /etc/dhcpcd.conf, scroll all the way to the bottom of the file, and on a new line type static ip_address=xxx.xxx.xxx.xxx/24, where xxx.xxx.xxx.xxx is replaced with the ip-address I wrote down earlier, followed by return. Then I save the file and exit nano. Remember, each node must have its own unique ip_address.
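Since every node needs its own address, it can help to write the lines out ahead of time. The sketch below uses Python's ipaddress module to build one static ip_address line per node and to catch duplicates or typos. The hostnames and the 192.168.1.x addresses are made-up examples, not the ones assigned on my network, and the function name dhcpcd_line is hypothetical.

```python
import ipaddress

# Hypothetical hostnames and addresses; substitute the ones you wrote down.
nodes = {
    "node1": "192.168.1.101",
    "node2": "192.168.1.102",
    "node3": "192.168.1.103",
}


def dhcpcd_line(address, prefix=24):
    """Return the line to add to /etc/dhcpcd.conf for one node."""
    # ip_interface raises ValueError if the address is malformed.
    iface = ipaddress.ip_interface(f"{address}/{prefix}")
    return f"static ip_address={iface.with_prefixlen}"


# Every node must have its own unique address.
if len(set(nodes.values())) != len(nodes):
    raise ValueError("two nodes share the same ip-address")

for name, address in nodes.items():
    print(name, "->", dhcpcd_line(address))
```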

While I am still logged into the node I type sudo raspi-config followed by enter. This brings up the raspi-config tool interface. Be sure to do the following:

  • change the password (this should be the same on all nodes)
  • set ssh to always on (this should be the same on all nodes)
  • set the hostname (this should be unique on each node)
  • change the memory split to 16 (this can be the same on all nodes)

 

 

Filed under Python, Raspberry PI

Pythonista v3 Workflow

Sometimes I feel like I should be going to a support group or something.

“Hi my name is Mark and I am a mobile Python addict.”

The crowd would respond, “Hi Mark”.

Now I don’t mean to take away from actual support groups or that kind of thing, but I sometimes have the feeling that the people around me don’t understand, and those at work should get it at least a little bit.

I write iOS software for a living and spend all day on a Mac, iPhone, and iPad. And while I enjoy what I do, I have to say that I really prefer to hack away at Python applications, in particular those for which I have a deep interest. This has led to a bit of frustration on my part though. And a somewhat rocky relationship with my Apple products, the iPhone and iPad in particular. Don’t get me wrong, I love my iDevices, such as they are, but it feels like Apple has tried to make using these devices as development platforms as hard as possible. Oh sure, I hear all the time that they are doing it for the security of a person’s information and data, but really? Why can’t I transfer a script file from one application to another? Anyway, I digress.

I have arrived at a passable (read mostly painless but more convoluted than necessary) workflow for doing my mobile Python development, and I can’t state too clearly or loudly that Pythonista is central to everything I do. That does not mean that there are not other apps that I also need and use, but Pythonista is on the top of that pile.

My workflow is very loosely the following:

1. Write code in Pythonista.
2. Test in Pythonista.
3. Iterate steps 1 and 2 until satisfied.
4. Copy changed files to Clone.
5. Commit changes to GitHub in Clone.
6. If enough has been done, push changes to GitHub in Clone; otherwise go back to step 1.

I know, you are asking yourself what you missed. Where was the big secret? Well, unfortunately, the deal here is that Apple makes this workflow necessary by restricting what app developers are allowed to do in their apps, in particular with regard to programming languages and scripts. So there you are. Using these devices as development platforms is possible, and even fun, but it could be sooooooooo much more so if Apple would just pull their head out of … well wherever it is, and work with the developers instead of against them.

Now some of you may be coming to this late and are asking yourself, why should I care about Pythonista? Well, Pythonista is a Python environment for iOS, so if you are a developer and have wanted to learn, expand, or just exploit your Python skills on an iPhone or iPad, it is worth a look. If you are not a developer then I am flattered that you have read this far, but you may be lost on the internet again.

The latest version of Pythonista is v3.1 as of this writing. And while there are several Python environments available for iOS, Pythonista is, in my oh so humble opinion, the reigning king. In the v3 release the author added support for both Python flavors (Python 2 and Python 3), which was a huge thing for me. Now I can easily test that my scripts are running in both, and make the effort to support both. (I really must come clean and admit that this is more a resolution than a practice at this point, but I have to start somewhere.) Another detail that is absolutely huge is that Pythonista comes with a collection of Python packages preinstalled. In particular there are numpy and matplotlib, among others. Lastly, Pythonista has one of the better editors for Python that I have found. While it is not THE best, it is very good and is well integrated into the environment. All in all, I am very happy with Pythonista. Check out their website for more info. What blows my mind is that I now have a development environment for my iPad, and can run the scripts that I develop there on my iPhone as well. Mind blown.

The other app that I use in my workflow is Clone, a GitHub client. What I liked most about Clone was its editor. Other GitHub clients have only rudimentary editors, and the editor in Clone is actually pretty good. This is a big deal in my workflow because I depend on the editors to faithfully copy and paste the code (not to beat a dead horse but this is where the restrictions imposed by Apple interfere with the process). In addition, I also really liked the way Clone works with GitHub. It has a few minor blemishes, but overall is my preferred choice.

Well, that was…therapeutic.

In a future post I will review the GitHub client called Working Copy. It looks like I could possibly automate it from Pythonista. Now that would be exciting!

Filed under Python

Synchronator for Pythonista

Pythonista is the best Python environment currently available for iOS (at least as far as I am concerned). There are other Python synchronization apps for Pythonista, but they all use the original Dropbox API V1, which is deprecated and soon to be discontinued. The Synchronator module was created using the new V2 API in order to synchronize Python scripts between Pythonista on iOS devices and to back them up to Dropbox.

Synchronator is dependent on another module, called DropboxSetup, which saves and loads Dropbox access tokens for use by other Python modules.
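To give a feel for what saving and loading an access token involves, here is a small sketch. It is not the DropboxSetup code: the function names, the JSON layout, and the file name are all assumptions made for illustration, and it uses only the standard library.

```python
import json
import tempfile
from pathlib import Path


def save_token(path, token):
    """Write the access token to a small JSON file."""
    Path(path).write_text(json.dumps({"access_token": token}))


def load_token(path):
    """Return the saved access token, or None if nothing has been saved."""
    p = Path(path)
    if not p.exists():
        return None
    return json.loads(p.read_text()).get("access_token")


# Demo using a throwaway directory and a placeholder token.
with tempfile.TemporaryDirectory() as folder:
    token_file = Path(folder) / "token.json"  # hypothetical file name
    print(load_token(token_file))             # nothing saved yet
    save_token(token_file, "PLACEHOLDER-TOKEN")
    print(load_token(token_file))
```

Keeping the token in its own file is what lets several modules share it, which is the role described for DropboxSetup above.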

The code is available from GitHub at the link https://github.com/markhamilton1/Synchronator/tree/master.

For Synchronator to work properly it needs the latest version of the dropbox Python package, which I use Stash to install. The latest version of the dropbox package has support for both the original V1 API and the newer V2 API, the latter of which Synchronator needs to operate.

Once these pieces are all in place on your iOS device, you will need to configure Synchronator before it will work. The following steps can be used to do this.

1 – Go to the Dropbox developer web page.

2 – Create an app that uses the Dropbox API V2. (Not the Dropbox for Business API)

3 – Select the App Folder option.

4 – Enter a name for the app. I recommend Synchronator-yourname.

If the previous steps were successful then you have created an app and should now be on the app page where you can edit the properties of the app.

5 – Find the property “Generated Access Token” and select the Generate button.

6 – Select and copy the Access Token to the clipboard.

7 – Execute Synchronator in Pythonista on your iOS device.

8 – Enter the Access Token at the prompt. Paste it if you performed steps 1 through 6 on the same iOS device that Pythonista is on.

If everything was successful then Synchronator will begin synchronizing with Dropbox.

Filed under iOS, Python

Algorithms: Doubly Linked Lists

As you saw in the previous two posts, lists can be used for a variety of things and, depending on the need, can be implemented differently. The lists discussed so far are commonly referred to as singly linked lists, referring to the fact that they have only one link in the node element.

Really the only difference between the stack and queue implementations is whether the link is to the next or the previous element in the list. This has a pretty substantial impact on how the list can be traversed. In the stack the result is that traversal is from the youngest element on the stack to the oldest. In the queue the traversal is from oldest to youngest. This is called the natural traverse order, and is directly dependent on the implementation of the node element.

The fact that these lists have only a single link also complicates the process of adding new nodes someplace other than the top (in the case of the stack) or the input or output (in the case of the queue). If we needed to add a new node someplace else in either of these structures we would need to keep references to the previous nodes as we made the traversal to where we wanted to perform the insertion. Clearly, while this can be done, it complicates the code, and possibly increases execution time. The point is that these structures are great for their particular uses, but may become much less useful as our needs change.

We solve this problem with a more general data structure called the doubly linked list. The node of this type of linked list has both a previous link as well as a next link. The most obvious side effect of this is that the list can then be traversed in either direction. More importantly though, we now have the ability to add nodes anywhere in the list that we may need.

If you look at the code for the doubly linked list in the file AlgoLinkedList.py you will notice that the Node class has both a previous and a next link, and that the LinkedList class has both a root and a tail. This combination means that we could start at either end of the list and traverse to the other. More importantly it also means that the list contains all of the information necessary to support the addition of a new node anywhere in the list.

When I implemented this linked list class, I tried to adhere as closely to the Python list definition as possible. This class then can serve as an example of how some of the other aspects of Python are accomplished in addition to illustrating the doubly linked list algorithm.

One more thing I should point out is that in the previous classes the data item to be stored in the data structure was the parameter to the push method. In the linked list, because I have generalized the implementation, the methods all operate on the Node object, which requires that you allocate a Node to hold the actual data. Also, there are two methods that can be used to add a Node to the list: insert and append.
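To make the description concrete, here is a stripped-down sketch of the idea. It is not the code from AlgoLinkedList.py: the attribute names (prev, next, root, tail), the signatures of append and insert, and the forward and backward helpers are assumptions for illustration, and the real class has more methods.

```python
class Node:
    """Holds one data item plus links in both directions."""

    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self):
        self.root = None  # first node in the list
        self.tail = None  # last node in the list

    def append(self, node):
        """Add node after the current tail."""
        node.prev, node.next = self.tail, None
        if self.tail is None:
            self.root = node
        else:
            self.tail.next = node
        self.tail = node

    def insert(self, node, before):
        """Add node in front of an existing node called before."""
        node.prev, node.next = before.prev, before
        if before.prev is None:
            self.root = node
        else:
            before.prev.next = node
        before.prev = node

    def forward(self):
        """Yield data from root to tail."""
        cur = self.root
        while cur is not None:
            yield cur.data
            cur = cur.next

    def backward(self):
        """Yield data from tail to root."""
        cur = self.tail
        while cur is not None:
            yield cur.data
            cur = cur.prev


items = LinkedList()
a, c = Node("a"), Node("c")
items.append(a)
items.append(c)
# The previous link on c means no traversal is needed to find its neighbor.
items.insert(Node("b"), before=c)
print(list(items.forward()))
print(list(items.backward()))
```

The insert in the middle touches only the new node and its two neighbors, which is exactly what the singly linked versions could not do without extra bookkeeping.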

As an abstraction, the data is less important than the implementation of the underlying algorithm, and by focusing on the algorithm we can use what we build in a much wider variety of ways. The downside is that we may lose some performance due to the increase in inherent flexibility.

As before I have provided the source for the linked list as well as unit tests to verify proper operation of the code.

AlgoLinkedList.py

AlgoLinkedListTests.py

As a point of interest, I also include the implementations of a Stack and a Queue using the generalized linked list class at the end of the file. Take time to compare these implementations to those from before. The new versions have the exact same methods but rely almost completely on the linked list class for their implementation.

Welcome to Computer Science with Python!

Filed under Python

Algorithms: Queue

The next algorithm I will discuss is the Queue. A Queue is only slightly more complicated than the stack, which is to say, not much. Queues are used regularly to process objects or data in “first-in first-out” order.

Think of a queue as a line of people waiting to pay a cashier. One person at a time enters the queue, and they exit the queue in the same order, no matter how long they have to wait. For lack of better terms, and to maintain consistency, I am going to use the push and pop semantics discussed in the stack article. An item gets pushed into the queue, and an item gets popped out of the queue.

Each pop will remove an item from the queue until the queue is empty.

The queue algorithm is similar to that of the stack, the only difference being how the link property of the Node is used and the root links maintained by the queue. In the case of the stack, the top link always referred to the first Node on the stack and the link always referred to the next Node in the stack. When an item was pushed onto or popped off of the stack, the top link was used as the point to which the change was made.

In the queue algorithm, there are two root links, one for the input end of the queue and one for the output end. I am going to break with convention and call these root links out and in, rather than head and tail.

As with the stack, this algorithm is all about managing the elements that are added to and removed from the queue, but since the elements are removed from the opposite end there is additional complexity.

Take a look at the queue algorithm in AlgoQueue.py.

You will notice first the difference in the Element class. Instead of a link to the next Element, this one has a link to the previous Element. In the queue algorithm the links flow from the out link to the in link.

The other obvious change is that instead of the _top link there is an _in link as well as an _out link. When Elements are pushed into the queue, they are added at the _in link. When Elements are popped from the queue, they are removed from the _out link.

All of the same methods exist that were defined for the stack algorithm, but push and pop have been changed to handle the queue behavior.

And again, the _cnt member variable is an optimization so that we do not have to traverse the list to get a count.
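Here is a minimal sketch of that arrangement. It follows the names mentioned above (Element, _in, _out, _cnt, push, pop), but it is a reconstruction for illustration rather than the contents of AlgoQueue.py. The previous attribute name, the count and is_empty methods, and the choice to raise IndexError on an empty queue are assumptions.

```python
class Element:
    def __init__(self, data):
        self.data = data
        self.previous = None  # the element that entered the queue after this one


class Queue:
    def __init__(self):
        self._in = None   # most recently pushed element
        self._out = None  # next element to be popped
        self._cnt = 0     # cached count, so no traversal is needed

    def push(self, data):
        element = Element(data)
        if self._in is None:
            self._out = element          # first element is also the next out
        else:
            self._in.previous = element  # link the old input end to the new one
        self._in = element
        self._cnt += 1

    def pop(self):
        if self._out is None:
            raise IndexError("pop from an empty queue")
        element = self._out
        self._out = element.previous     # follow the link toward the input end
        if self._out is None:
            self._in = None              # the queue is now empty
        self._cnt -= 1
        return element.data

    def count(self):
        return self._cnt

    def is_empty(self):
        return self._cnt == 0


line = Queue()
for person in ("Ann", "Bob", "Cy"):
    line.push(person)
print(line.pop(), line.pop(), line.count())
```

The people leave in the order they arrived, and the links run from the output end toward the input end as described.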

I have written two Python files for this blog, one that implements the Queue class, and a second one that performs unit tests on the Queue class. They are in the following:

AlgoQueue.py

AlgoQueueTests.py

Though similar, these two algorithms behave very differently.

I have been asked why I would ever implement these algorithms using linked lists, rather than with an array/list. There are 2 primary reasons I might elect to do so.

1. To teach the concept, which is what these algorithm blogs are all about.

2. Efficiency. Algorithms are ALL about trade-offs. There are always many different ways to accomplish a particular task; the question to be answered is which way best suits the current needs.

The stack and queue algorithms that I have implemented so far are all members of the family of linked lists. These constructs are about the flexibility of changing the list. If I were to use an array, I would have to preselect a size to allocate, and once the array was filled, I would have to grow it. Arrays are about random access. The trade-off is in where the performance is achieved.

Linked lists are easier to work with from the perspective of adding and removing: as long as a new Element can be allocated, it can be added. The overhead is the need for links and the need to navigate the list via those links.

Arrays are easier to work with from the perspective of accessing an element. Adding is cheap as long as the array has free space, but when the array is filled, a new one must be allocated and all of the Elements copied from the old one to the new one. The overhead is in maintaining the memory to hold the array.
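The copying cost is easy to see in a toy growable array. This sketch is not part of the Algo files; the class name GrowableArray, the starting capacity of 1, and the doubling rule are assumptions chosen to keep the numbers simple.

```python
class GrowableArray:
    def __init__(self):
        self._slots = [None]  # fixed-size storage, capacity 1 to start
        self._cnt = 0
        self.copies = 0       # how many element copies growth has cost so far

    def append(self, item):
        if self._cnt == len(self._slots):
            # Full: allocate a bigger array and copy every element across.
            bigger = [None] * (2 * len(self._slots))
            for i in range(self._cnt):
                bigger[i] = self._slots[i]
                self.copies += 1
            self._slots = bigger
        self._slots[self._cnt] = item
        self._cnt += 1

    def __getitem__(self, index):
        # Random access is a single step, with no links to follow.
        if not 0 <= index < self._cnt:
            raise IndexError(index)
        return self._slots[index]


numbers = GrowableArray()
for n in range(8):
    numbers.append(n)
print(numbers[5], numbers.copies)
```

Appending 8 items here costs 7 element copies (1, then 2, then 4), while reading numbers[5] never touches any other element. A linked list makes the opposite trade.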

This has been a very simplistic discussion of the pros and cons; the real point is learning the concept. You can never use what you don’t know. As a software developer, algorithms are the tools that you use to create. As the saying goes, “If all you have is a hammer, then the world seems to be filled with only nails.”

Welcome to Computer Science with Python!

Filed under Python

Algorithms: Stack

So often we need to understand how to implement functionality using more complicated algorithms. In this issue, I introduce a very simple implementation of a stack algorithm. Stacks are collection objects used regularly to process objects or data in “first-in last-out” (equivalently, “last-in first-out”) order.

Think of a stack as nothing more than a pile of papers on a table top. Each page is placed on the stack, first one and then the next, etc. This process is referred to as a push operation. Each item is pushed onto the stack on top of the items that were already there.

It is then possible to remove the last item placed in the stack from the stack using a pop operation. Each pop removes the next object until the stack is empty.

The stack algorithm is based on a pointer called top, which points to the last item pushed on the stack. Each item pushed on the stack is called an element, which is made up of two variables, a data variable and a next variable. The data variable is where the data that is pushed onto the stack is stored. The next variable refers to the element that was pushed just before it, which in turn holds its own data and next variable, and so on until the oldest element, which has None in its next variable.

This algorithm is all about managing the elements added to the stack.

Take a look at the stack algorithm in AlgoStack.py.

There are only a few methods needed to implement the stack. The stack is allocated by creating a new instance of the Stack class. Once you have a stack instance you can push data onto the stack, and pop data from the stack.

In addition you can test to see if the stack is empty or how many elements are on the stack. Lastly it is possible to clear the contents of the stack.

You will notice in the implementation that each instance of the class has two member variables: _top and _cnt. _top is the pointer to the top element in the stack. The _cnt variable contains a count of the elements in the stack. It is an optimization to prevent the need of having to traverse the stack to count the elements. And this is as simple as it gets.
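As a rough sketch of what that looks like, here is a small version of the idea. It is a reconstruction for illustration, not the contents of AlgoStack.py: the method names is_empty, count, and clear, and the choice to raise IndexError on an empty stack, are assumptions.

```python
class Element:
    def __init__(self, data, next_element=None):
        self.data = data
        self.next = next_element  # the element pushed just before this one


class Stack:
    def __init__(self):
        self._top = None  # last element pushed
        self._cnt = 0     # cached count of elements on the stack

    def push(self, data):
        # The new element points at the old top and becomes the new top.
        self._top = Element(data, self._top)
        self._cnt += 1

    def pop(self):
        if self._top is None:
            raise IndexError("pop from an empty stack")
        element = self._top
        self._top = element.next
        self._cnt -= 1
        return element.data

    def is_empty(self):
        return self._top is None

    def count(self):
        return self._cnt

    def clear(self):
        self._top = None
        self._cnt = 0


pages = Stack()
for page in ("first", "second", "third"):
    pages.push(page)
print(pages.pop(), pages.count())
```

The last page placed on the pile is the first one to come off, and the count is available without walking the links.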

I have written two Python files for this particular blog, one that implements the Stack class, and a second one that performs unit tests on the Stack class. They are in the following:

AlgoStack.py

AlgoStackTests.py

Welcome to Computer Science with Python!

Filed under Python