code clouds

Wednesday, January 22, 2014

A video about Profiling

To paraphrase the video:

Profiling is about making your code faster in an intelligent way. You should make your code "correct" first, then worry about making it fast. That's because performance changes invariably make your code's maintainability *worse*.

This really should be required material for OpenStack contributors.
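Here's a minimal sketch of what "intelligent" can look like in practice: measure first, then decide what to change. It uses the cProfile and pstats modules from Python's standard library. The slow_sum function is a made-up workload and profile_call is a hypothetical helper name; neither comes from the video.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares, written the obvious way."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_call(slow_sum, 100000)
    print(result)
    print(report)
```

The point is that the report tells you where the time goes before you trade any maintainability for speed.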

Friday, December 27, 2013

The one about OpenStack and unit tests in PyCharm

If you are working on OpenStack you'll need to work in Linux. Hypothetically, OpenStack code is written in Python and Python code can be written to be platform independent. In practice, much of the Python code in OpenStack's components is written in the most expedient way and that way is often platform specific.

If you are using a virtual machine to run Linux, you might want your native host to run the IDE portion of your development cycle. I've managed to get a hybrid development environment where I use PyCharm on the host as my IDE for development and unit testing, alongside a standard Linux environment hosted in a Linux virtual machine, where I use vi for editing and devstack for setup and manual functional testing of my code.

On OS X, first prime your development environment.
Check out Nova (or your particular OpenStack project):

$ git clone https://github.com/openstack/nova.git

Prime the OpenStack component's virtual environment:

$ ./run_tests.sh
When prompted to build a .venv, answer 'Y'.

Start PyCharm and tell it to open the directory where your new OpenStack project lives.
Once PyCharm has set up the project and has it open for you, modify the project settings.
The configuration menu has a large number of settings; we're interested in the settings for the project interpreter:
Once you've selected the Project Interpreter panel, you will see the Python interpreters that PyCharm knows about. Typically it will only have auto-detected the system-wide Python interpreter. You could just use this one if you wanted, but that would require carefully setting up all the OpenStack dependencies for the interpreter and keeping them up to date with what OpenStack needs. Instead, we'll take the .venv-based interpreter that ./run_tests.sh just built for you and set it up as a PyCharm interpreter.
The lower left hand corner of the dialog will have a "+" symbol for adding a new interpreter:
You'll be prompted to select a path to the interpreter. The prompt may already list the path you are interested in, but if it doesn't you'll have to select "Local..."
Next you'll see the familiar file selector dialog. You'll need to select the file .venv/bin/python which will require you to navigate your file system to find the OpenStack project you just checked out.

Once you have found it, choose it.
You'll next be prompted on how you want to add the Virtualenv, and we'll probably want to not only make this interpreter available for this PyCharm project, but also any PyCharm project we work on. You really only need to select the top check-box, but the bottom box won't hurt the current project any.
When you are done you'll see a listing like this one for your PyCharm configured interpreters.
You will also have to manually install some packages such as Nose by selecting "Install" for the selected interpreter and choosing Nose related packages.

Close the package install dialog with the red "x" button in the upper left hand corner.
You are now ready to run tests locally. Many unit tests lean on environment variables. You may be forced to edit your Python test runner configurations. You modify the default test runner by accessing the "Edit Configurations..." dialog here:

Next you set the Environment variables here...

... I've edited my Environment Variables by clicking the "..." button ...

... and adding the EVENTLET_NO_GREENDNS variable...
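If you'd rather catch a missing variable before a test run than after a confusing failure, a check like the one below could live in a test helper. It's only a sketch: the helper name and the required-variable table are hypothetical, and the value "yes" is an assumption about what eventlet expects, so check it against the eventlet version you have installed.

```python
import os

# Assumption: the variables your test runner needs. The value "yes" for
# EVENTLET_NO_GREENDNS is an assumption, not something this post confirms.
REQUIRED_TEST_ENV = {"EVENTLET_NO_GREENDNS": "yes"}


def missing_test_env(environ=None, required=REQUIRED_TEST_ENV):
    """Return the sorted names of required variables that are unset or differ."""
    if environ is None:
        environ = os.environ
    return sorted(
        name for name, value in required.items()
        if environ.get(name) != value
    )


if __name__ == "__main__":
    problems = missing_test_env()
    if problems:
        print("Set these in the run configuration:", ", ".join(problems))
```

Passing a plain dict instead of os.environ makes the check easy to try without touching your real environment.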

You may have to use menu File -> Invalidate Caches and restart PyCharm for it to fully accept the new Python interpreter.

You'll have a new problem... how do you move changes between your Host OS and your Guest OS? You could always check in WIP patches and pull them down to the different environments, but that's a lot of extra noise on the review system. Why not just pull the patches between your two OSes by sharing the file system of the Guest with the Host?

For that you'll need some extra setup and some osascript hacking. I'll cover that in a different post.

Thursday, December 26, 2013

The one about how "Nothing is Virtual"

Nothing is Virtual because everything eventually makes the journey down to what I call bits on metal. At some point every virtual machine, every hypervisor, and every network becomes real bits represented by real signals traveling on real fiber and real wires.

Anyone who's worked with me for a while knows that I am never content with merely a working theory of how a system works. I have an implicit desire to look at the system's fundamentals and at what underlies an application. I've had the good fortune to have been given a solid grounding in computing theory and practical application early in my career, and this lets me get beneath the surface issues most developers focus on.

This is one of the reasons I ended up hacking the Linux kernel, working on drivers, and understanding virtualization technologies even though I was a "web developer", "database administrator", or some other stripe of application developer. Knowing what happened beneath the application made me better at the game of creating working applications that could scale and troubleshooting applications in the wild. It's one of the reasons I was able to work as a consultant, architect, and troubleshooter while also able to work as a coder.

When I look at OpenStack's current state of affairs I see a system that could truly benefit from some deeper insight into the assumptions the technologies used in its development make. For example, OpenStack makes assumptions about Python which lead to a set of assumptions about its supporting C libraries. These two tiers of assumptions relate to real world operational behaviors that only cloud administrators would ever really have to come to terms with in real practice.

One example is the SSL certificate validation behaviors I wrote about previously. OpenStack assumes that Python will use a trusting model when accepting SSL certificates by default. This is built on assumptions about the C libraries that Python sits on top of. Some of these C libraries in turn have assumptions about the environment variables on the OS they sit on top of. Those assumptions might include hidden variables (hidden from the Python scripts' perspective) that toggle behavior in ways that the cloud administrator might be surprised at.

Another example of OpenStack making assumptions that may not hold is here in a patch (see link) I've submitted for Nova. The Python code makes assumptions about how the OS will expose its mount points. These are specific assumptions made for Linux, which currently exposes mount points via a file system interface. The technique used in Linux is nicely matched to the Unix Philosophy, but it is a specific assumption that limits OpenStack Nova to functioning only on Linux distributions that honor this convention. (I know of none that do not honor this right now, but some of us remember the 1990s and how divergent some Linux distros could get.)
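To make the assumption concrete, here is a minimal sketch of reading mount points through a file system interface. This is not the code from the patch. The path /proc/mounts is an assumption about which file is being read (the post doesn't name it), and both function names are hypothetical.

```python
def parse_mounts(text):
    """Parse mount-table text: device, mount point, fs type, options, ..."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "options": options.split(","),
        })
    return mounts


def read_linux_mounts(path="/proc/mounts"):
    """Linux-specific: raises an error on platforms without this file."""
    with open(path) as handle:
        return parse_mounts(handle.read())


if __name__ == "__main__":
    for mount in read_linux_mounts():
        print(mount["mount_point"], mount["fs_type"])
```

Keeping the parsing separate from the file read is one way to contain the platform assumption: only read_linux_mounts knows about Linux, so another platform could supply the same data from a different source.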

As a Java developer and Java architect, understanding the JVM and how it interacted with hypervisors made me better at understanding how my PaaS hosted applications would scale on top of the IaaS infrastructure they were hosted on. In OpenStack, I'm looking to find the same insights to make Nova better. That's because Nothing is Virtual.

So what I'm looking for now in my OpenStack work are the ways in which the hypothetical and virtual components interact with what's really real. This includes the details of how the various Nova hypervisors interact with their real hardware, what assumptions they make about the real infrastructure and how those assumptions affect scale and fitness of the platform.

This is important because nothing is virtual and IaaS suffers from a lack of understanding this principle as much as anything else.

Saturday, December 21, 2013

The one about SSL and signed certificates

Greetings Stackers!

Here's an example of a tiny thing that can ruin your day if you aren't paying close attention.

In nova.conf, the option ssl.ca_file switches on strict SSL certificate checking.

Suppose you have signed certs everywhere except for one place. Well, you're gonna have a bad day, 'cuz in Nova you can only pick one of two modes: either *all* certs need to be signed by a CA, or *no* certs need to be signed by a CA. So, just be aware.
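Here's a toy model of that all-or-nothing switch, using Python's standard ssl module. This is not Nova's code. The function name and the single ca_file argument are stand-ins made up to show the shape of the trade-off.

```python
import ssl


def build_client_context(ca_file=None):
    """Toy model: one setting decides strict or trusting for every endpoint."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        # Strict: every server certificate must chain to a CA in ca_file.
        ctx.load_verify_locations(cafile=ca_file)
    else:
        # Trusting: no certificate is checked at all.
        ctx.check_hostname = False  # must be disabled before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


if __name__ == "__main__":
    print(build_client_context().verify_mode)
```

There is no middle branch here for "strict everywhere except that one host", which is exactly the bad day described above.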

I've also found this example https://gist.github.com/ssbarnea/8007689 (backup here) that shows how setting global environment variables can change the behavior of libraries underneath Python without touching anything in Python itself. That means you can change Python's strictness about SSL globally without necessarily exposing any visible control anywhere within OpenStack's configurations.
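You can ask Python which environment variables its underlying OpenSSL build consults for CA locations. The sketch below uses ssl.get_default_verify_paths() from the standard library; it does not reproduce the gist, and the helper name is hypothetical. On typical OpenSSL builds the two names are SSL_CERT_FILE and SSL_CERT_DIR, but treat that as an assumption and look at what your own interpreter reports.

```python
import os
import ssl


def ssl_env_overrides(environ=None):
    """Map each CA-related variable name OpenSSL reads to its current value."""
    if environ is None:
        environ = os.environ
    paths = ssl.get_default_verify_paths()
    names = (paths.openssl_cafile_env, paths.openssl_capath_env)
    # A value of None means the variable is not set in this environment.
    return {name: environ.get(name) for name in names}


if __name__ == "__main__":
    for name, value in ssl_env_overrides().items():
        print(name, "=", value)
```

Anything other than None in that output means something outside Python, and outside OpenStack's configuration files, is steering certificate lookup.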

You've been warned.

Keyword spam to see if this post can get into a search engine:

Verify return code: 21 (unable to verify the first certificate)
nova-compute
vCenter unsigned SSL certificate

Tuesday, December 3, 2013

clouds bleeding context

In OpenStack Havana we had a bug that did not affect functionality, but it bothered me a great deal. In OpenStack Grizzly, when the Nova Compute driver for VMware's management API created a new instance (aka virtual machine), it would name the VM in vSphere's API based on the pattern instance-000001, where the number portion of the instance's name was the row number of the instance's record in the Nova database.

There are several really glaring problems here. First, it violates Nova's encapsulation rather blatantly. We were taking a value that you could never use through the exposed Nova API and flagrantly exposing it to the world. That's a major leak of Nova's internal implementation details.

Fortunately in Havana we managed to solve this problem and the vSphere name is based on a value that is exposed by Nova. The UUID for the instance is now the vSphere name. We kept the vestigial code that did lookups based on the pattern /instance-\d+/ around just for people doing Nova upgrades. The result is that at least Nova's internals don't bleed into vSphere.
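As an illustration of the two naming schemes, here is a small sketch that sorts a vSphere VM name into "legacy", "uuid", or "unknown". It is not the driver's code. The function name is hypothetical, and matching the whole name against the pattern (rather than searching inside it) is an assumption.

```python
import re
import uuid

# The legacy pattern from the post, applied to the entire name (assumption).
LEGACY_NAME = re.compile(r"instance-\d+")


def classify_vm_name(name):
    """Return 'legacy', 'uuid', or 'unknown' for a vSphere VM name."""
    if LEGACY_NAME.fullmatch(name):
        return "legacy"
    try:
        uuid.UUID(name)
    except ValueError:
        return "unknown"
    return "uuid"


if __name__ == "__main__":
    for name in ("instance-000001", str(uuid.uuid4()), "renamed-by-admin"):
        print(name, "->", classify_vm_name(name))
```

The "unknown" bucket is what you get once somebody changes the name by hand.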

The bug I've filed today is related to the fact that in vSphere the name attribute of a guest virtual machine is not immutable. If a VI Admin edits the instance name while running the initially released Havana version of the driver, that admin can break their OpenStack automation. Ideally, this should never be a problem, since the Nova Compute manager should be able to accomplish all the tasks that a VM administrator or tenant needs. In practice, this is likely to be a sore spot for hybrid cloud administrators whose environments are both OpenStack and vSphere, managed simultaneously through tools like PowerCLI or RVC, which allow modification of the name.

Hopefully we'll get a fix that makes the linkage between Nova Compute and vSphere's admin tools more robust, but until then there are cloud details that bleed through each layer, and that's a sign of possible trouble in design, implementation, and administration.

I'm just trying to watch out for that admin with too many hats to wear.