published by noreply@blogger.com (Spencer Christensen) on 2014-08-25 14:54:00 in the "Camps" category

Most web developers have practices and tools they are used to for doing their work, and for most that means setting up a workstation with everything needed to edit, compile, test, and push code somewhere for sharing or deployment. This is a very common practice even though it is fraught with problems: getting a database set up properly, configuring a web server and any other services (memcached, redis, mongodb, etc.), and many more issues.

Hopefully at some point you realize the pain that is involved in doing everything on your workstation directly and start looking for a better way to do web development. In this post I will be looking at some ways to do this better: using a virtual machine (VM), Vagrant, and DevCamps.

Using a VM for development

One way to improve things is to use a local virtual machine for your development (for example, using VirtualBox, or VMware Fusion). You can edit your code normally on your workstation, but then execute and test it in the VM. This also makes your workstation "clean", moving all those dependencies (like a database, web server, etc.) off your workstation and into the VM. It also gets your dev environment closer to production, if not identical. Sounds nice, but let's break down the pros and cons.

Benefits of using a VM
  • Dev environment closely matches production.
  • Execute and test code in a dedicated machine (not your workstation directly).
  • Allows for multiple projects to be worked on concurrently (one VM per project).
  • Exposes the developer to the Operations (systems administration) side of the web application (always a good thing).
  • Developer can edit files using their favorite text editor locally on the workstation (but will need to copy files to the VM as needed).
Problems with using a VM
  • Need to create and configure the VM. This could be very time consuming and error prone.
  • Still need to install and configure all services and packages. This could also be time consuming and error prone.
  • Backups of your work/configuration/everything are your own responsibility (extremely unlikely to happen).
  • Access to your dev environment is extremely limited, thus probably only you can access it and test things on it. No way for a QA engineer or business owner to test/demo your work.
  • Inexperienced developers can break things, or change them to no longer match production (install arbitrary packages, different versions than what is in production, screw up the db, screw up Apache configuration, etc.).
  • If working with an established database, then downloading a dump, installing, and getting the database usable is time consuming and error prone. ("I just broke my dev database!" can be a complete blocker for development.)
  • The developer needs to set up networking for the VM in order to ssh to it, copy files back and forth, and point a web browser to it. This may include manually setting up DNS, or /etc/hosts entries, or port forwarding, or more complex setups.
  • If using SSL with the web application, then the developer also needs to generate and install the SSL cert and configure the web server correctly.

Vagrant

What is Vagrant? It is a set of tools to make it easier to use a virtual machine for your web development. It attempts to lessen many of the problems listed above through the use of automation. By design it also makes some assumptions about how you are using the VM. For example, it assumes that you have the source code for your project in a directory somewhere directly on your workstation and would prefer to use your favorite text editor on those files. Instead of expecting you to continually push updated files to your VM, it sets up a corresponding directory on the VM and keeps the two in sync for you (using either shared folders, NFS, Samba, or rsync). It also sets up the networking for accessing the VM, usually with port forwarding, so you don't have to worry about that.

Benefits of Vagrant
  • Same as those listed above for using a VM, plus...
  • Flexible configuration (Vagrantfile) for creating and configuring the VM.
  • Automated networking for the VM with port forwarding. Abstracted ssh access (don't need to set up a hostname for the VM, simply type `vagrant ssh` to connect). Port forwarded browser access to the VM (usually http://localhost:8080, but configurable).
  • Sync'd directory between your workstation and the VM for source code. Allows for developers to use their favorite text editor locally on their workstation without needing to manually copy files to the VM.
  • Expects the use of a configuration management system (like puppet, chef, salt, or bash scripts) to "provision" the VM (which could help with proper and consistent setup).
  • Through the use of Vagrant Cloud you can get a generated url for others to access your VM (makes it publicly available through a tunnel created with the command `vagrant share`).
  • Configuration (Vagrantfile and puppet/chef/salt/etc.) files can be maintained/reviewed by Operations engineers for consistency with production.
Problems with Vagrant
  • Still need to install and configure all services and packages. This is lessened with the use of a configuration management tool like puppet, but you still need to create/debug/maintain the puppet configuration and setup.
  • Backups of your work/configuration/everything are your own responsibility (extremely unlikely to happen). This may be lessened for VM configuration files, assuming they are included in your project's VCS repo along with your source code.
  • Inexperienced developers can still break things, or change them to no longer match production (install arbitrary packages, different versions than what is in production, screw up the db, screw up Apache configuration, etc.).
  • If working with an established database, then downloading a dump, installing, and getting the database usable is time consuming and error prone. ("I just broke my dev database!" can be a complete blocker for development.)
  • If using SSL with the web application, then the developer also needs to generate and install the SSL cert and configure the web server correctly. This might be lessened if puppet (or whatever) is configured to manage this for you (but then you need to configure puppet to do that).

DevCamps

The DevCamps system takes a different approach. Instead of using VMs for development, it utilizes a shared server for all development. Each developer has their own account on the camps server and can create/update/delete "camps" (which are self-contained environments with all the parts needed). There is an initial setup for using camps which needs thorough understanding of the web application and all of its dependencies (OS, packages, services, etc.). For each camp, the system will create a directory for the user with everything related to that camp in it, including the web application source code, their own web server configuration, their own database with its own configuration, and any other resources. Each camp is assigned a camp number, and all services for that camp run on different ports (based on the camp number). For example, camp 12 may have Apache running on ports 9012 (HTTP) and 9112 (HTTPS) and MySQL running on port 8912. The developer doesn't need to know these ports, as tools allow for easier access to the needed services (commands like `mkcamp`, `re` for restarting services, `mysql_camp` for access to the database, etc.).
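
As a rough sketch of that port-numbering idea (the base port numbers below are assumptions taken from the camp 12 example; a real camps installation defines its own scheme in its configuration):

# Hypothetical helper illustrating per-camp port assignment.
def camp_ports(camp_number):
    return {
        "http":  9000 + camp_number,
        "https": 9100 + camp_number,
        "mysql": 8900 + camp_number,
    }

ports = camp_ports(12)
print(ports["http"], ports["https"], ports["mysql"])  # 9012 9112 8912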

DevCamps has been designed to address some of the pain usually associated with development environments. Developers usually do not need to install anything, since all dependencies should already be installed on the camps server (which should be maintained by an Operations engineer who can keep the packages, versions, etc. consistent with production). Having all development on a server allows Operations engineers to back up all dev work fairly easily. Databases do not need to be downloaded or manually set up; they should be set up initially with the camps system, and then running `mkcamp` clones the database and sets it up for you. Running `refresh-camp --db` allows a developer to delete their camp's database and get a fresh clone, ready to use.

Benefits of DevCamps
  • Each developer can create/delete camps as needed, allowing for multiple camps at once and multiple projects at once.
  • Operations engineers can manage/maintain all dependencies for development, ensuring everything is consistent with production.
  • Backups of all dev work are easy (an Operations engineer just needs to back up the camps server).
  • Developer does not need to configure services (camp templating system auto-generates needed configuration for proper port numbers), such as Apache, nginx, unicorn, MySQL, Postgres, etc.
  • SSL certificates can be easily shared/generated/installed/etc. automatically with the `mkcamp` script. Dev environments can easily have HTTPS without the developer doing anything.
  • Developers should not have permissions to install/change system packages or services. Thus inexperienced developers should not be able to break the server, break other developers' environments, or install arbitrary software. Screwing up their database or web server config can be fixed by creating a new camp, refreshing their existing one, or having an Operations engineer fix it for them (since it is on a central server they already have access to, with no need to worry about how to access some VM who knows where).
Problems with DevCamps
  • Since all camps live on a shared server running on different ports, this will not closely match production in that way. However, this may not be significant if nearly everything else does closely match production.
  • Adding a new dependency (for example, adding mongodb, or upgrading the version of Apache) may require quite a bit of effort and will affect all camps on the server; an Operations engineer will need to install the needed packages and add/change the needed configuration in the camps system and templates.
  • Using your favorite text editor locally on your workstation doesn't really work since all code lives on the server. It is possible to SFTP files back and forth, but this can be tedious and error prone.
  • Many aspects of the Operations (systems administration) side of the web application are hidden from the developer (this might also be considered a benefit).
  • All development is on a single server, which may be a single point of failure (if the camps server is down, then all development is blocked for all developers).
  • One camp can use up more CPU/RAM/disk/etc. than others, driving up the server's load and affecting the performance of all other camps.

Concluding Thoughts

It seems that Vagrant and DevCamps certainly have some good things going for them. I think it might be worth some thought and effort to try to meld the two together somehow, to take the benefits of both and reduce the problems as much as possible. Such a system might look like this:

  • Utilize vagrant commands and configuration, but have all VMs live on a central VM server. Thus allowing for central backups and access.
  • Source code and configuration lives on the server/VM but a sync'd directory is set up (sshfs mount point?) to allow for local editing of text files on the workstation.
  • VMs created should have restricted access, preventing developers from installing arbitrary packages, versions, screwing up the db, etc.
  • Configuration for services (database, web server, etc.) should be generated/managed by Operations engineers for consistency (utilizing puppet/chef/salt/etc.).
  • Databases should be cloned from a local copy on the VM server, thus avoiding the need to download anything and reducing setup effort.
  • SSL certs should be copied/generated locally on the VM server and installed as appropriate.
  • Sharing access to a VM should not depend on Vagrant Cloud, but instead should use some sort of internal service on the VM server to automate VM hostname/DNS for browser and ssh access to the VM.

I'm sure there are more pros and cons that I've missed. Add your thoughts to the comments below. Thanks.


published by noreply@blogger.com (Dave Jenkins) on 2014-08-22 14:21:00 in the "Education" category

This past week, End Point had the distinct pleasure of sending a Liquid Galaxy Express (the highly portable version of the platform) to the Daniel Island School in Charleston, South Carolina. Once it arrived, we provided remote support to their staff setting up the system. Through the generous donations of Mason Holland, Benefitfocus, and other donors, this PK-8 grade school is now the first school in the country below the university level with a Liquid Galaxy on campus.

From Claire Silanowicz, who coordinated the installation:

Mason Holland was introduced to the Liquid Galaxy system while visiting the Google Headquarters in San Francisco several months ago. After deciding to donate it to the Daniel Island School here in Charleston, SC, he brought me on to help with the project. I didn't know much about the Liquid Galaxy at first, but quickly realized how cool of a project this was going to be. With some help, I assembled a team of about 8 Benefitfocus employees to help with installation and long-term implementation. Benefitfocus is full of employees who are so passionate about innovative technology, and Mason's involvement with Benefitfocus was a perfect way to connect the company to the community. We had one meeting before the installation date to go over the basics and a few days later 5 of us were at the school unpacking boxes and assembling the 7-screen display. Once it was completed and turned on, we were all in awe. We went from the Golden Gate Bridge to the Duomo in Florence, Italy in a matter of seconds. We traveled to our homes and went to see our office building on street view. After going back to the school in the days to follow, I realized we only touched the tip of the iceberg. The faculty at the school had discovered the museums and underwater views that Google has managed to capture.

The Liquid Galaxy isn't known for revolutionizing the way children learn, but I firmly believe it is going to do just that at the Daniel Island School. The teachers and faculty are so excited to incorporate this new technology into their curricula. They have a unique opportunity to take this technology and make it an integral part of their teaching. I hope that in the future, other elementary and middle schools can have the Liquid Galaxy system so that teachers all over the country can collaborate and take advantage of everything it has to offer!

STEM education is becoming ever-more important in the fast economy of the 21st century. With a Liquid Galaxy these young students are exposed at a very early age to the wonders of geography, geology, urban development, oceanography, and demographics, not to mention the technological wonderment the platform itself invokes in young minds: with seven 55" screens mounted on an arced frame, a touchscreen podium and a rack of computers nearby, the Liquid Galaxy is a visually impressive piece of technology regardless of what is being shown on the screens.

This installation is another in a string of academic and educational deployments for the platform. End Point provides 24-hour monitoring and remote support for Liquid Galaxies at Westfield University in Massachusetts, the University of Kansas, the National Air & Space Museum in Washington DC, the Oceanographic Museum in Monaco, and a host of other educational institutions. We also work closely with researchers at the Lleida campus in Spain and the University of Western Sydney in Australia. We know of other Liquid Galaxies on campuses in Colorado, Georgia, Oklahoma, and Israel.

We expect great things from these students, and hope that some may eventually join us at End Point as developers!


published by noreply@blogger.com (Matt Galvin) on 2014-08-20 22:09:00 in the "active shipping gem" category

Hello again all. I was working on another Spree Commerce site; Spree is a Ruby on Rails based e-commerce platform. As many of you know, Spree Commerce comes with Promotions. According to the Spree Commerce documentation, Spree Commerce Promotions are:

"... used to provide discounts to orders, as well as to add potential additional items at no extra cost. Promotions are one of the most complex areas within Spree, as there are a large number of moving parts to consider."

The promotions feature can be used to offer discounts like free shipping, buy one get one free, etc. The client on this particular project had asked for the ability to provide a coupon for free shipping. Presumably this would be a quick and easy addition since these types of promotions are included in Spree.

The site in question makes use of Spree's Active Shipping Gem, and plugs in the UPS Shipping API to return accurate and timely shipping prices with the UPS carrier.

The client offers a variety of shipping methods including Flat Rate Ground, Second Day Air, 3 Day Select, and Next Day Air. Often, Next Day Air shipping costs several times more than Ground. E.g.: If something costs $20 to ship Ground, it could easily cost around $130 to ship Next Day Air.

When creating a free shipping Promotion in Spree it's important to understand that by default it will be applied to all shipping methods. In this case, the customer could place a small order, apply the coupon and receive free Next Day Air shipping! To take care of this you need to use Promotion Rules. Spree comes with several built-in rules:

  • First Order: The user's order is their first.
  • ItemTotal: The order's total is greater than (or equal to) a given value.
  • Product: An order contains a specific product.
  • User: The order is by a specific user.
  • UserLoggedIn: The user is logged in.

As you can see there is no built-in Promotion Rule to limit the free shipping to certain shipping methods. But fear not, it's possible to create a custom rule.

module Spree
  class Promotion
    module Rules
      class RestrictFreeShipping < PromotionRule
        MATCH_POLICIES = %w(all)

        def eligible?(order, options = {})
          order.shipment.shipping_method.admin_name == "UPS Flat Rate Ground"
        end
      end
    end
  end
end

Note that you have to create a partial for the rule, as per the documentation.

Then, in config/locales/en.yml I added a name and description for the rule.

en:
  spree:
    promotion_rule_types:
      restrict_free_shipping:
        name: Restrict Free Shipping To Ground
        description: If somebody uses a free shipping coupon it should only apply to ground shipping

The last step was to restart the app and configure the promotion in the Spree Admin interface.


published by noreply@blogger.com (Jon Jensen) on 2014-08-19 21:45:00 in the "email" category

On a Debian GNU/Linux 7 ("wheezy") system with both IPv6 and IPv4 networking set up, running Postfix 2.9.6 as an SMTP server, we ran into a mildly perplexing situation. The mail logs showed that for outgoing mail to MX servers we know have IPv6 addresses, the IPv6 address was only being used occasionally, while the IPv4 address was being used often. We expected it to always use IPv6 unless there was some problem, and that's been our experience on other mail servers.

At first we suspected some kind of flaky IPv6 setup on this host, but that turned out not to be the case. The MX servers themselves are fine using only IPv6. In the end, it turned out to be a Postfix configuration option called smtp_address_preference:

smtp_address_preference (default: any)

The address type ("ipv6", "ipv4" or "any") that the Postfix SMTP client will try first, when a destination has IPv6 and IPv4 addresses with equal MX preference. This feature has no effect unless the inet_protocols setting enables both IPv4 and IPv6. With Postfix 2.8 the default is "ipv6".

Notes for mail delivery between sites that have both IPv4 and IPv6 connectivity:

The setting "smtp_address_preference = ipv6" is unsafe. It can fail to deliver mail when there is an outage that affects IPv6, while the destination is still reachable over IPv4.

The setting "smtp_address_preference = any" is safe. With this, mail will eventually be delivered even if there is an outage that affects IPv6 or IPv4, as long as it does not affect both.

This feature is available in Postfix 2.8 and later.

That documentation made it sound as if the default had changed to "ipv6" in Postfix 2.8, but at least on Debian 7 with Postfix 2.9, it was still defaulting to "any", thus effectively randomly choosing between IPv4 and IPv6 on outbound SMTP connections where the MX record pointed to both.

Changing the option to "ipv6" made Postfix behave as expected.


published by noreply@blogger.com (Greg Sabino Mullane) on 2014-08-16 18:11:00 in the "laptop" category

I recently upgraded my main laptop to Ubuntu 14.04, and had to solve a few issues along the way. Ubuntu is probably the most popular Linux distribution. Although it is never my first choice (that would be FreeBSD or Red Hat), Ubuntu is superb at working "out of the box", so I often end up using it, as the other distributions all have issues.

Ubuntu 14.04.1 is a major "LTS" version, where LTS is "long term support". The download page states that 14.04 (aka "Trusty Tahr") comes with "five years of security and maintenance updates, guaranteed." Alas, the page fails to mention the release date, which was July 24, 2014. When a new version of Ubuntu comes out, the OS will keep nagging you until you upgrade. I finally found a block of time in which I could survive without my laptop, and started the upgrade process. It took a little longer than I thought it would, but went smoothly except for one issue:

Issue 1: xscreensaver

During the install, the following warning appeared:

"
"One or more running instances of xscreensaver or xlockmore have been detected on this system. Because of incompatible library changes, the upgrade of the GNU libc library will leave you unable to authenticate to these programs. You should arrange for these programs to be restarted or stopped before continuing this upgrade, to avoid locking your users out of their current sessions."
"

First, this is a terrible message. I'm sure it has caused lots of confusion, as most users probably do not know what xscreensaver and xlockmore are. Is it so hard for the installer to tell which one is in use? Why in the world can the installer not simply stop these programs itself?! The solution was simple enough: in a terminal, I ran:

pgrep -l screensaver
pkill screensaver
pgrep -l screensaver

The first command was to see if I had any programs running with "screensaver" in their name (I did: xscreensaver). As it was the only program that matched, it was safe to run the second command, which stopped xscreensaver. Finally, I re-ran the pgrep to make sure it was stopped and gone. Then I did the same thing with the string "lockmore" (which found no matches, as I expected). Once xscreensaver was turned off, I told the upgrade to continue, and had no more problems until after Ubuntu 14.04 was installed and running. The first post-install problem appeared after I suspended the computer and brought it back to life - no wireless network!

Issue 2: no wireless after suspend

Once suspended and revived, the wireless would simply not work. Everything looked normal: networking was enabled, wifi hotspots were detected, but a connection could simply not be made. After going through bug reports online and verifying the sanity of the output of commands such as "nmcli nm" and "lshw -C network", I found a solution. This was the hardest issue to solve, as it had no intuitive solution, nothing definitive online, and was full of red herrings. What worked for me was to *remove* the suspension of the iwlwifi module. I commented out the line from /etc/pm/config.d/modules, in case I ever need it again, so the file now looks like this:

# SUSPEND_MODULES="iwlwifi"

Once that was commented out, everything worked fine. I tested by doing sudo pm-suspend from the command-line, and then bringing the computer back up and watching it automatically reconnect to my local wifi.

Issue 3: color diffs in git

I use the command-line a lot, and a day never goes by without heavy use of git as well. On running a "git diff" in the new Ubuntu version, I was surprised to see a bunch of escape codes instead of the usual pretty colors I was used to:

ESC[1mdiff --git a/t/03dbmethod.t b/t/03dbmethod.tESC[m
ESC[1mindex 108e0c5..ffcab48 100644ESC[m
ESC[1m--- a/t/03dbmethod.tESC[m
ESC[1m+++ b/t/03dbmethod.tESC[m
ESC[36m@@ -26,7 +26,7 @@ESC[m ESC[mmy $dbh = connect_database();ESC[m
 if (! $dbh) {ESC[m
    plan skip_all => 'Connection to database failed, cannot continue testing';ESC[m
 }ESC[m
ESC[31m-plan tests => 543;ESC[m
ESC[32m+ESC[mESC[32mplan tests => 545;ESC[m

After poking around with terminal settings and the like, a coworker suggested I simply tell git to use an intelligent pager with the command git config --global core.pager "less -r". The output immediately improved:

diff --git a/t/03dbmethod.t b/t/03dbmethod.t
index 108e0c5..ffcab48 100644
--- a/t/03dbmethod.t
+++ b/t/03dbmethod.t
@@ -26,7 +26,7 @@ my $dbh = connect_database();
 if (! $dbh) {
    plan skip_all => 'Connection to database failed, cannot continue testing';
 }
-plan tests => 543;
+plan tests => 545;

Thanks Josh Williams! The above fix worked perfectly. I'm a little unsure of this solution as I think the terminal and not git is really to blame, but it works for me and I've seen no other terminal issues yet.

Issue 4: cannot select text in emacs

The top three programs I use every day are ssh, git, and emacs. While trying (post-upgrade) to reply to an email inside mutt, I found that I could not select text in emacs using ctrl-space. This is a critical problem, as it is an extremely important feature to lose in emacs. The problem was pretty easy to track down: the program "ibus" was intercepting all ctrl-space calls for its own purpose. I have no idea why ctrl-space was chosen, as emacs has used it since before Ubuntu was even born (the technical term for this is "crappy default"). Fixing it requires visiting the ibus-setup program. You can reach it via the system menu by going to Settings Manager, then scrolling down to the "Other" section and finding "Keyboard Input Methods". Or you can simply run ibus-setup from your terminal (no sudo needed).


The ibus-setup window

However you get there, you will see a section labelled "Keyboard Shortcuts". There you will see a "Next input method:" text box, with the inside of it containing <Control>space. Aha! Click on the three-dot button to the right of it, and change it to something more sensible. I decided to simply add an "Alt", such that going to the next input method will require Ctrl-Alt-Space rather than Ctrl-Space. To make that change, just select the "Alt" checkbox, click "Apply", click "Ok", and verify that the text box now says <Control><Alt>space.

So far, those are the only issues I have encountered using Ubuntu 14.04. Hopefully this post is useful to someone running into the same problems. Perhaps I will need to refer back to it in a few years(?) when I upgrade Ubuntu again! :)


published by Eugenia on 2014-08-05 02:47:41 in the "Recipes" category
Eugenia Loli-Queru

Oopsies are the Americanized version of the French souffle. My French husband loved them. They can be baked in ramekins for a more authentic souffle taste (in this case omit the almond flour), or as bread buns. They’re extremely low carb, and Paleo/Primal.

Ingredients (makes 6 buns)
* 4 eggs, yolks and whites separated in two bowls
* 3/4 cup of creamy goat cheese, or shaved emmental cheese
* 2 tablespoons of almond or coconut flour
* 1/2 teaspoon of cream of tartar (or baking soda)

Method
1. Preheat oven to 350 F (175 C). To the bowl with the whites, add the cream of tartar.
2. Beat the whites on high speed until very, very stiff, about 4-5 minutes.
3. Add the cheese and flour to the yolk bowl, and beat until smooth, about 1-2 minutes.
4. Fold the yolk mixture slowly into the whites, and mix carefully with a spatula for a few seconds.
5. Spoon the mixture in 6 pieces, on a baking sheet with a parchment paper. Bake for 15-20 minutes until golden brown. Serve immediately.

Per Serving (3 buns): 430 calories, 3 gr of net carbs, 36 gr of fat, 25% protein, 83% Lysine. 45% B12, 72% Riboflavin, 63% choline, 55% A, 23% calcium, 59% phosphorus, 31% selenium, 33% copper.


published by noreply@blogger.com (Josh Ausborne) on 2014-08-01 17:08:00

For our Liquid Galaxy installations, we use a master computer known as a "head node" and a set of slave computers known as "display nodes." The slave computers all PXE-boot from the head node, which directs them to boot from a specific ISO disk image.

In general, this system works great. We connect to the head node and from there can communicate with the display nodes. We can boot them, change their ISO, and do all sorts of other maintenance tasks.

There are two main settings that we change in the BIOS to make things run smoothly. First is that we set the machine to power on when AC power is restored. Second, we set the machine's boot priority to use the network.

Occasionally, though, the CMOS battery has an issue, and the BIOS settings get lost.  How do we get in and boot the machine up? This is where ipmitool has really become quite handy.

Today we had a problem with one display node at one of our sites. It seems that all of the machines in the Liquid Galaxy were rebooted, or otherwise powered off and then back on. One of them just didn't come up, and it was causing me much grief.  We have used ipmitool in the past to be able to help us administer the machines.

IPMI stands for Intelligent Platform Management Interface, and it gives the administrator some non-operating-system-level access to the machine.  Most vendors have some sort of management interface (HP's iLO, Dell's DRAC), including our Asus motherboards.  The open source ipmitool is the tool we use on our Linux systems to interface with the IPMI module on the motherboard.

I connected to the head node and ran the following command and got the following output:

    admin@headnode:~ ipmitool -H 10.42.41.33 -I lanplus -P 'xxxxxx' chassis status
    System Power         : off
    Power Overload       : false
    Power Interlock      : inactive
    Main Power Fault     : false
    Power Control Fault  : false
    Power Restore Policy : always-off
    Last Power Event     : ac-failed
    Chassis Intrusion    : inactive
    Front-Panel Lockout  : inactive
    Drive Fault          : false
    Cooling/Fan Fault    : true

While Asus's Linux support is pretty lacking, and most of the options we find here don't work with the open source ipmitool, we did find "System Power : off" in the output, which is a pretty good indicator of our problem.  This tells me that the BIOS settings have been lost for some reason, as we had previously set the system to power on when AC power was restored.  I ran the following to tell it to boot into the BIOS, then powered on the machine:

    admin@headnode:~ ipmitool -H 10.42.41.33 -I lanplus -P 'xxxxxx' chassis bootdev bios
    admin@headnode:~ ipmitool -H 10.42.41.33 -I lanplus -P 'xxxxxx' chassis power on

At this point, the machine is ready for me to be able to access the BIOS through a terminal window. I opened a new terminal window and typed the following:

    admin@headnode:~ ipmitool -H ipmi-lg2-3 -U admin -I lanplus sol activate
    Password:

After typing in the password, I get the ever-helpful dialog below:

    [SOL Session operational.  Use ~? for help]

I didn't bother with the ~? because I knew that the BIOS would eventually just show up in my terminal. There are, however, other commands that pressing ~? would show.

See, look at this terminal version of the BIOS that we all know and love!



Now that the BIOS was up, it's as if I was really right in front of the computer typing on a keyboard attached to it. I was able to get in and change the settings for the APM, so that the system will power on upon restoration of AC power. I also verified that the machine is set to boot from the network port before saving changes and exiting. The next thing I knew, the system was booting up PXE, which then pointed it to the proper ISO, and then it was all the way up and running.

And this, my friends, is why systems should have IPMI. I state the obvious here when I say that life as a system administrator is so much easier when one can get into the BIOS on a remote system.


published by noreply@blogger.com (Josh Williams) on 2014-08-01 03:01:00 in the "Conference" category
Just got back from PyOhio a couple of days ago. Columbus used to be my old stomping grounds so it's often nice to get back there. And PyOhio had been on my TODO for a number of years now, but every time it seemed like something else just got in the way. This year I figured it was finally time, and I'm quite glad it worked out.

While of course everything centered around usage with Python, many of the talks covered other tools or projects. I return with a much better view of good technologies like Redis, Ansible, Docker, ØMQ, Kafka, Celery, asyncio in Python 3.4, Graphite, and much more that isn't coming to mind at the moment. I have lots to dig into now.

It also pleased me to see so much Postgres love! I mean, clearly, once you start to use it you'll fall in love, that's without question. But the hall track was full of conversations about how various people were using Postgres, what it tied in to in their applications, and various tips and tricks they'd discovered in its functionality. Just goes to prove that Postgres == ♥.

Naturally PostgreSQL is what I spoke on; PL/Python, specifically. It actually directly followed a talk on PostgreSQL's LISTEN/NOTIFY feature. I was a touch worried about overlap considering some of the things I'd planned, but it turns out the two talks more or less dovetailed from one to the other. It was unintentional, but it worked out very well.

Anyway, the slides are available, but the talk wasn't quite structured in the normal format of having those slides displayed on a projector. Instead, in a bit of an experiment, the attendees could hit a web page and bring up the slides on their laptops or such. That slide deck opened a long-polling socket back to the server, and the web app could control the slide movement on those remote screens. That let the projector keep to a console session that was used to illustrate PL/Python and PostgreSQL usage. As you might expect, the demo included a run through the PL/Python and related code that drove that socket. Hopefully the video, when it's available, caught some of it.

The sessions were recorded on video, but one thing I hadn't expected was how that influenced which talks I tried to attend. Knowing that the more software-oriented presentations would be available for viewing later, where available I opted for more hardware-oriented topics, or other talks where being present seemed like it would have much more impact. I also didn't feel rushed between sessions on the occasions where I got caught in a hall track conversation or checked out something in the open spaces area (in one sense, a dedicated hall track room).

Overall, it was a fantastic conference and a great thank you goes out to everyone that helped make it happen!

published by noreply@blogger.com (Joshua Tolley) on 2014-07-31 22:29:00 in the "graphics" category

Image by Stoermerjp, unmodified (CC BY-SA 3.0)

The Liquid Galaxy began as a system to display geographic data through Google Earth, but it has expanded quickly as a display platform for other types of information. We've used Liquid Galaxies for panoramic images and video, three dimensional models of all sorts, as well as time-lapse renderings of weather, infrastructure, and economic data. Now we've added support for a new type of data, the point cloud.

"Point cloud" is simply the common term for a data set consisting of individual points, often in three-dimensional space, and frequently very large, containing thousands or millions of entries. Points in a cloud can include not just coordinate data, but other information as well, and because this sort of data can come from many different fields, the possible variations are endless. For instance, the terrain features visible in Google Earth began life as point clouds, the output of an aerial scanning device such as a LIDAR scanner. These scanners sweep their field of view rapidly, scanning millions of points to determine their location and any other interesting measurements -- color, for instance, or temperature -- and with that data create a point cloud. Smaller scale hardware scanners have made their way into modern life, scanning rooms and buildings, or complex objects. A few years ago, the band Radiohead collaborated with Google to use the 3-D scanning techniques to film a music video, and published the resulting point cloud on Google Code.

Image by Xorx, unmodified (CC BY-SA 3.0)

For the Liquid Galaxy platform, we modified an open source point cloud viewer called Potree to allow one master instance to control several others. Potree is a WebGL application. It runs in a browser, and depends on three.js, a widely used WebGL library. Generally speaking, to convert an application to run on a Liquid Galaxy, the developer must give the software two separate modes: one which allows normal user control and transmits information to a central data bus, and another which receives information from the bus and uses it to create its own modified display. In this case, where the application simply loads a model and lets the user move it around, the "master" mode tracks all camera movements and sends them to the other Liquid Galaxy nodes, which will draw the same point cloud in the same orientation as the master, but with the camera pointing offset to the left or right a certain amount. We've dubbed our version lg-potree.

This marks the debut of a simple three.js extension we've been working on, which we've called lg-three.js, designed to make it easy to adapt three.js applications for the Liquid Galaxy. lg-three.js gives the programmer an interface for capturing and serializing things like camera movements or other scene changes on the master node, and de-serializing and using that data on the other nodes, hopefully without having to modify the original application much. Some applications' structure doesn't lend itself well to lg-three, but potree proved fairly straightforward.

With that introduction, please enjoy this demonstration.


published by noreply@blogger.com (Steph Skardal) on 2014-07-29 20:06:00 in the "javascript" category

A while back, we started using the Nestable jQuery Plugin for H2O. It provides interactive hierarchical list functionality – or the ability to sort and nest items.


Diagram from Nestable jQuery Plugin representing interactive hierarchical sort and list functionality.

I touched on H2O's data model in this post, but it essentially mimics the diagram above: a user can build sortable and nestable lists. Nesting is visible at up to 4 levels. Each list is accessible and editable as its own resource, owned by a single user. The plugin is ideal for working with the data model; however, I needed a bit of customization that I'll describe in this post.

Limiting Nestability to Specific List Items

The feature I was requested to develop was to limit nesting to items owned by the current authorized (logged in) user. Users can add items to their list that are owned by other users, but they can not modify the list elements for that list item. In visual form, it might look something like the diagram below, where green represents the items owned by the user which allow modified nesting, and red represents items that are not owned by the user which can not be modified. In the diagram below, I would not be able to add to or reorder the contents of Item 5 (including Items 6 - 8), and I would not be able to add any nested elements to Item 10. I can, however, reorder elements inside Item 2, which means e.g. I can move Item 5 above Item 3.


Example of nesting where nesting is prohibited among some items (red), but allowed under others (green).

There are a couple of tricky bits to note in developing this functionality:

  • The plugin doesn't support this type of functionality, nor is it currently maintained, so there are absolutely no expectations of this being an included feature.
  • These pages are fully cached for performance optimization, so there is no per-user logic that can be run to generate modified HTML. The solution here was implemented using JavaScript and CSS.

Background Notes on the Plugin

There are a couple of background notes on the plugin before I go into the implemented solution:

  • The plugin uses <ol> tags to represent lists. Only items in <ol> elements are nestable and sortable.
  • The plugin recognizes .dd-handle as the draggable handle on a list item. If an item doesn't have a .dd-handle element, no part of it can be dragged.
  • The plugin creates a <div> with a class of dd-placeholder to represent the placeholder where an item is about to be dropped. The default appearance for this is a white box with dashed outline.
  • The plugin has an on change event which is triggered whenever any item is dropped in the list or any part of the list is reordered.

Step 1: JavaScript Update for Limiting Nestability

After the content loads, as well as after additional list items are added, a method called set_nestability is run to modify the HTML of the content, represented by the pseudocode below:

set_nestability: function() {
  // if user is not logged in
    // nestability is never enabled
  // if user is logged in user is not a superadmin (superadmins can edit all) 
    // loop through each list item 
      // if the list item data('user_id') != $.cookie('user_id')
        // remove dd-handle class from all list item children .dd-handle elements
        // replace all <ol> tags with <ul> tags inside that list item

The simple bit of pseudocode does two things: It removes the .dd-handle class for elements that can't be reordered, and it replaces <ol> tags with <ul> tags to enable CSS. The only thing to take note of here is that in theory, a "hacker" can change their cookie to enable nestability of certain items, but there is additional server-side logic that would prohibit an update.

Step 2: CSS Updates

ul .dd-placeholder {
  display: none;
}

Next up, the simple CSS change above was made. This results in the placeholder div being hidden in any non-editable list. I made several small CSS modifications so that the ul and ol elements would look the same otherwise.

Step 3: JavaScript On Change Updates

Finally, I modified the on change event:

$('div#list').on('change', function(el) {
  // if item is dropped
    // if dropped item is not inside <ol> tag, return (do nothing)
    // else continue with dropped logic
  // else if position is changed
    // trigger position change logic  
}

The on change event does nothing when the dropped item is not inside an editable list. Otherwise, the dropped logic continues as the user has permissions to edit the list.

Conclusion

The functionality described here has worked well. What may become a bit trickier is when advanced rights and roles will allow non-owners to edit specific content, which I haven't gotten to yet. I haven't found additional resources that offer sortable and nestable functionality in jQuery, but it'd be great to see a new well-supported plugin in the future.


published by Eugenia on 2014-07-29 00:09:53 in the "General" category
Eugenia Loli-Queru

Let’s assume you shipwreck on a deserted island (knock wood). Somehow, half-buried in the sand you find a magic box with a message in it. The message asks you to specify 10 foods that the box will magically bring to you every day. These will be the same foods every day, so your choices have to be very specific. If you choose foods like pizza, which hold no nutritional value, you will die within a few months.

I was inspired to write this blog post because of the story about Napoleon. He was imprisoned for a while in the 1800s, and was asked what food he would like in prison. His captors would serve him the same food daily, hoping that he would expire “naturally” from malnutrition.

But here’s how to trick the magic box (or your captors) to give you the highest bang for your buck, not only keeping you alive until rescue arrives, but to make you thrive!

1. 150 gr of pastured beef or goat heart
The heart must be cooked in bone broth with onion, garlic, sea salt, and grass-fed butter. We pick a heart over muscle meat because it’s more nutritious and it has the highest levels of CoQ10. We don’t pick liver because its extremely high levels of copper and vitamin A will work against your health if consumed daily (although liver should be consumed once a week under normal living conditions, of course).

2. 150 gr of wild Alaskan salmon, sashimi raw
Very high levels of omega-3 and many other nutrients. Wild sardines or wild trout would be my No2 choices.

3. 100 gr of wild, raw oysters
If liver is the most nutrient-dense food of all, oysters are No. 2. We pick them because of their high levels of zinc, among other nutrients.

4. 1 pastured duck egg, raw or fried
Chicken eggs hold nothing to duck eggs. Duck eggs are more nutrient dense per gram, and they create fewer allergies than chicken eggs. Cooked in coconut oil if fried.

5. 30 gr of sunflower seeds, soaked for 4 hours
These have very high levels of B1 and E (higher than that of almonds). Extra B1 is needed on a diet that doesn’t contain legumes or enriched flours.

6. 100 gr kale, raw
The most nutrient-dense vegetable. Served with extra virgin olive oil and sea salt.

7. 250 gr of white potatoes, baked
We need some starches, no matter what keto people say. We carry friendly bacteria in our gut that can only live on starch. White potatoes also carry the important resistant type of starch.

8. 100 gr of asparagus, or 100 gr of an avocado
Asparagus seared in coconut oil. Avocados would be my personal second choice, but you might want to pick that instead, if you are after more fiber and even more nutrition.

9. 500 gr home-made raw, full-fat goat kefir
Fermented foods are needed, and some calcium too. Kefir is the definite powerhouse in this case.

10. 1 pink grapefruit per day
Antioxidants and enough vitamin C. If grapefruits are unavailable (due to season), 150 gr of blueberries will do.

And of course, lots of natural, spring water.

As you can see below (click to view larger), that set of foods gives you pretty much over 100% of your daily needs for most needed nutrients. You can’t go wrong with these!

Don’t be afraid about the trans-fats shown there and the low omega-3 shown in the chart. The database that contained similar foods did not include their pastured/wild versions, and so the data in that respect are a bit skewed. The vitamins/minerals are pretty much correct though.


published by Eugenia on 2014-07-27 03:56:47 in the "Recipes" category
Eugenia Loli-Queru

Cauliflower fried rice is the best substitute for Chinese fried rice on low carb and grain-free Paleo diets. Here’s a generic recipe for it, accompanied by hints and tips on how to make the recipe work best. You see, if you treat cauliflower like rice, you will end up with a mushy, cabbage-smelling dish. Following the tips below will bring your fried cauliflower much closer to the real thing.

Ingredients (for 2)
* Half a cauliflower head, in small florets
* 2 chicken eggs, or 1 duck egg
* 4 tablespoons of olive oil
* 1 small leek, cleaned and chopped
* 1/2 cup of frozen peas
* 1 cup of boneless chicken, or shelled shrimp
* 1/2 cup of mushrooms, chopped (and/or carrots, peppers, broccoli etc)
* 1 green onion, chopped
* 1 clove of garlic, chopped
* 1.5 tablespoons of gluten-free tamari soy sauce, or coconut aminos
* 1 teaspoon of turmeric (optional)
* black pepper to taste

Method
1. On a small frying pan, with a tablespoon of olive oil, crack an egg on low heat. Using a wooden spoon, continuously swirl the egg until you achieve a scrambled egg consistency. Turn off the heat before the egg is fully cooked, and set aside.
2. On a wok or frying pan, add the chicken (or shrimp), 1.5 tablespoons of olive oil, peas, mushrooms/veggies, leeks, garlic, and black pepper. Stir occasionally. Add the soy sauce. Cook in medium heat until the chicken is done and the leeks have become transparent and soft, and there’s no liquid left in the pan. Set aside.
3. Using a food processor and its S blade, add half of the cauliflower in it. Give it 5-6 jolts until the cauliflower has become “riced”. Do not make the pieces too small, but it should still feel a bit chunky. Set aside, and process the rest of the cauliflower.
4. On a very large frying pan (I used a 14″) that is not wok-like (but rather it has a flat surface), add 1.5 tablespoons of olive oil, the turmeric (if using), and the cauliflower. Under high heat, fry the cauliflower, stirring occasionally, until it starts to get burned marks and it starts to feel dry.
5. Add the meat mixture in to the big frying pan with the cauliflower rice, and stir. Add the scrambled eggs, green onion, and stir. A minute later, turn off the heat, and serve.

Tips
1. We use a very large, leveled frying pan instead of a wok because woks tend to trap moisture. We’re trying to get rid of as much moisture from the dish as possible, because it’s that moisture that brings the cabbage smell to cauliflower.
2. We’re using leeks because these emulate the sweetness of rice. Without it, the dish comes out a bit flat in taste.
3. Do not process the cauliflower too much, or too much moisture will come out of it.
4. Do not crack the egg on the same pan as the cauliflower. While scrambling the egg on the side of the pan works with rice, it doesn’t work as well for cauliflower. Same goes for the meat mixture.


published by noreply@blogger.com (Miguel Alatorre) on 2014-07-24 13:00:00 in the "import" category

For a Python project I'm working on, I wrote a parent class with multiple child classes, each of which made use of various modules that were imported in the parent class. A quick solution to making these modules available in the child classes would be to use wildcard imports in the child classes:

from package.parent import *

however, PEP8 warns against this, stating that "they make it unclear which names are present in the namespace, confusing both readers and many automated tools."

For example, suppose we have three files:

# a.py
import module1

class A(object):
    def __init__(self):
        pass

# b.py
from a import *
import module2

class B(A):
    def __init__(self):
        super(B, self).__init__()

# c.py
from b import *

class C(B):
    def __init__(self):
        super(C, self).__init__()

To someone reading just b.py or c.py, it is unknown that module1 is present in the namespace of B and that both module1 and module2 are present in the namespace of C. So, following PEP8, I just explicitly imported any module needed in each child class. Because in my case there were many imports and because it seemed repetitive to have all those imports duplicated in each of the many child classes, I wanted to find out if there was a better solution. While I still don't know if there is, I did go down the road of how imports work in Python, at least for 3.4.1, and will share my notes with you.
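
For illustration, a minimal sketch of what that explicit-import approach looks like for one child class (reusing the hypothetical module1 and module2 names from above):

# b.py -- every module this class relies on is imported here explicitly,
# even though a.py already imports module1
import module1
import module2

from a import A

class B(A):
    def __init__(self):
        super(B, self).__init__()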

Python allows you to import modules using the import statement, the built-in function __import__(), and the function importlib.import_module(). The differences between these are:

The import statement first "searches for the named module, then it binds the results of that search to a name in the local scope" (Python Documentation). Example:

Python 3.4.1 (default, Jul 15 2014, 13:05:56) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>
>>> re.sub('s', '', 'bananas')
'banana'

Here the import statement searches for a module named re then binds the result to the variable named re. You can then call re module functions with re.function_name().

A call to function __import__() performs the module search but not the binding; that is left to you. Example:

>>> muh_regex = __import__('re')
>>> muh_regex
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>
>>> muh_regex.sub('s', '', 'bananas')
'banana'

Your third option is to use importlib.import_module() which, like __import__(), only performs the search:

>>> import importlib
>>> muh_regex = importlib.import_module('re')
>>> muh_regex
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>
>>> muh_regex.sub('s', '', 'bananas')
'banana'

Let's now talk about how Python searches for modules. The first place it looks is in sys.modules, which is a dictionary that caches previously imported modules:

>>> import sys
>>> 're' in sys.modules
False
>>> import re
>>> 're' in sys.modules
True
>>> sys.modules['re']
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>

If the module is not found in sys.modules Python searches sys.meta_path, which is a list that contains finder objects. Finders, along with loaders, are objects in Python's import protocol. The job of a finder is to return a module spec, using method find_spec(), containing the module's import-related information which the loader then uses to load the actual module. Let's see what I have in my sys.meta_path:

>>> sys.meta_path
[<class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib.PathFinder'>]

Python will use each finder object in sys.meta_path until the module is found and will raise an ImportError if it is not found. Let's call find_spec() with parameter 're' on each of these finder objects:

>>> sys.meta_path[0].find_spec('re')
>>> sys.meta_path[1].find_spec('re')
>>> sys.meta_path[2].find_spec('re')
ModuleSpec(name='re', loader=<_frozen_importlib.SourceFileLoader object at 0x7ff7eb314438>, origin='/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py')

The first finder knows how to find built-in modules and since re is not a built-in module, it returns None.

>>> 're' in sys.builtin_module_names
False

The second finder knows how to find frozen modules, which re is not. The third knows how to find modules from a list of path entries called an import path. For re the import path is sys.path but for subpackages the import path can be the parent's __path__ attribute.

>>>sys.path
['', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/site-packages/distribute-0.6.49-py3.4.egg', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python34.zip', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/plat-linux', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/lib-dynload', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/site-packages', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/site-packages/setuptools-0.6c11-py3.4.egg-info']

Once the module spec is found, the loading machinery takes over. That's as far as I dug but you can read more about the loading process by reading the documentation.
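
If you are curious, you can drive that last step by hand. Here is a small sketch that finds the spec yourself and asks its loader to execute the module; note that importlib.util.module_from_spec() was only added in Python 3.5, so on 3.4 you would normally just let importlib.import_module() do all of this for you.

import importlib.util

# Perform the same search PathFinder did above and get the module spec.
spec = importlib.util.find_spec('re')

# Create an empty module object from the spec, then let the spec's loader
# execute the module's code inside it (Python 3.5+).
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

print(module.sub('s', '', 'bananas'))  # prints 'banana'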


published by noreply@blogger.com (Kirk Harr) on 2014-07-24 10:58:00 in the "bash" category

When working with shell scripts written in bash/csh/etc., one of the primary tools you rely on is piping output and input between the subprocesses called by the script, building up complex logic to accomplish the script's goal. The same method of calling subprocesses and redirecting their input/output is available in Python, but doing it by hand is cumbersome enough to make Python a less desirable scripting language: in effect you end up implementing large parts of the I/O plumbing yourself, and potentially even writing replacements for existing shell utilities that already do the same work. Recently, Python developers attempted to solve this problem by updating an existing subprocess wrapper library called pbs into an easier-to-use library called sh.
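
As a rough illustration of that overhead, here is what a simple two-command pipeline looks like using nothing but the standard library's subprocess module (the specific commands are only an example):

import subprocess

# Equivalent of the shell pipeline "ls -l /etc | wc -l": wire the stdout of
# one process into the stdin of the next by hand.
ls = subprocess.Popen(["ls", "-l", "/etc"], stdout=subprocess.PIPE)
wc = subprocess.Popen(["wc", "-l"], stdin=ls.stdout, stdout=subprocess.PIPE)
ls.stdout.close()  # allow ls to receive SIGPIPE if wc exits first
output, _ = wc.communicate()
print(output.decode().strip())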

Sh can be installed using pip, and the author has posted some documentation for the library here: http://amoffat.github.io/sh/

Using the sh library

After installing the library into your version of Python, there are two ways to call any shell command available on the system. First, you can import the command as though it were itself a Python function:

from sh import hostname
print(hostname())
Alternatively, you can call the command directly by referencing it through the sh namespace:
import sh
print(sh.hostname())

Running this on my Linux workstation (hostname atlas) returns the expected result:

Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sh
>>> print(sh.hostname())
atlas

However, at this point we are merely replacing a single shell command that prints output to the screen. The real benefit of shell scripts is that you can chain commands together to build complex logic that does real work.

Advanced Gymnastics

A common use of shell scripts is to let administrators quickly filter log file output and search those logs for specific conditions, for example to raise an alert when an application starts throwing errors. With piping in sh, we can create a simple log watcher in Python that is capable of calling anything we like whenever the log file contains one of the conditions we are looking for.

To pipe commands together with the sh library, you nest each command inside the next, which creates a syntax similar to bash piping:

>>> print(sh.wc(sh.ls("-l", "/etc"), "-l"))
199

This command is equivalent to the bash pipeline "ls -l /etc | wc -l", and indicates that the long listing of /etc on my workstation contained 199 lines of output. Each command in the pipe is passed as the first argument of the command that consumes its output.
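The nesting goes as deep as you need. For instance, something like "ls -l /etc | grep conf | wc -l" can be expressed by wrapping one call in the next. (The count you get back will depend on your system, and note that grep exits non-zero when nothing matches, which sh surfaces as an exception.)

>>> print(sh.wc(sh.grep(sh.ls("-l", "/etc"), "conf"), "-l"))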

For our log watcher we will use the tail command along with a Python iterator to watch for a potential error condition, which I will represent with the string "ERROR":

>>> for line in sh.tail("-f", "/tmp/test_log", _iter=True):
...     if "ERROR" in line:
...         print line

In this example, once executed, Python calls tail to follow the log file and iterates over each line of output tail produces; if any line contains the string we are watching for, Python prints that line to standard output. So far this is similar to piping tail's output into a string search command like grep. However, you could replace the body of that if block with a more complex action (see the sketch below), such as emailing the error to a developer or administrator for review, or kicking off a procedure to recover from the error automatically.
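As a sketch of that idea, the if block could hand the offending line to a small helper that emails it out. Everything below is only an example; the SMTP host and addresses are placeholders you would replace with your own.

import sh
import smtplib
from email.mime.text import MIMEText

def alert_admin(line):
    # Build a plain-text message containing the offending log line.
    msg = MIMEText(line)
    msg["Subject"] = "ERROR seen in /tmp/test_log"
    msg["From"] = "logwatcher@example.com"   # placeholder sender
    msg["To"] = "admin@example.com"          # placeholder recipient
    # Hand the message to a local (placeholder) SMTP server.
    server = smtplib.SMTP("localhost")
    server.sendmail(msg["From"], [msg["To"]], msg.as_string())
    server.quit()

for line in sh.tail("-f", "/tmp/test_log", _iter=True):
    if "ERROR" in line:
        alert_admin(line)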

Conclusions

In this manner, with just a few lines of Python, much like with bash, you can create a relatively complex process without recreating the shell commands that already do this work, and without a convoluted wrapper that passes output from command to command. Combining existing shell commands with the power of Python gives you all the functionality of a full Python environment, plus the ease of letting shell commands do some of the work. In the future I will definitely be using this library for my own shell scripting needs; I have generally preferred the syntax and ease of use of Python over bash, and now I can enjoy both at the same time.


published by noreply@blogger.com (Greg Sabino Mullane) on 2014-07-24 05:18:00 in the "Bucardo" category

Image by Flickr user Rebecca Siegel (cropped)

Bucardo's much publicized ability to handle multiple data sources often raises questions about conflict resolution. People wonder, for example, what happens when a row in one source database gets updated one way, and the same row in another source database gets updated a different way? This article will explain some of the solutions Bucardo uses to solve conflicts. The recently released Bucardo 5.1.1 has some new features for conflict handling, so make sure you use at least that version.

Bucardo does multi-source replication, meaning that users can write to more than one source at the same time. (This is also called multi-master replication, but "source" is a much more accurate description than "master"). Bucardo deals in primary keys as a way to identify rows. If the same row has changed on one or more sources since the last Bucardo run, a conflict has arisen and Bucardo must be told how to handle it. In other words, Bucardo must decide which row is the "winner" and thus gets replicated to all the other databases.

For this demo, we will again use an Amazon AWS instance. See the earlier post about Bucardo 5 for directions on installing Bucardo itself. Once it is installed (after the './bucardo install' step), we can create some test databases for our conflict testing. Recall that we have a handy database named "shake1". As this name can get a bit long for some of the examples below, let's make a few database copies with shorter names. We will also teach Bucardo about the databases, and create a sync named "ctest" to replicate between them all:

createdb aa -T shake1
createdb bb -T shake1
createdb cc -T shake1
bucardo add db A,B,C dbname=aa,bb,cc
## autokick=0 means new data won't replicate right away; useful for conflict testing!
bucardo add sync ctest dbs=A:source,B:source,C:source tables=all autokick=0
bucardo start

Bucardo has three general ways to handle conflicts: built-in strategies, a list of databases, or custom conflict handlers. The primary strategy, and also the default one for all syncs, is known as bucardo_latest. When this strategy is invoked, Bucardo scans all copies of the conflicted table across all source databases, and then orders the databases according to when they were last changed. This generates a list of databases, for example "B C A". For each conflicting row, the database most recently updated, of all the ones involved in the conflict for that row, is the winner. The other built-in strategy is called "bucardo_latest_all_tables", which scans all the tables in the sync across all source databases to find a winner.

There may be other built-in strategies added as experience and demand dictate, but it is hard to develop generic solutions to the complex problem of conflicts, so the non-built-in approaches are preferred. Before getting into those other solutions, let's see the default strategy (bucardo_latest) in action:

## This is the default, but it never hurts to be explicit:
bucardo update sync ctest conflict=bucardo_latest
Set conflict strategy to 'bucardo_latest'
psql aa -c "update work set totalwords=11 where title~'Juliet'"; 
psql bb -c "update work set totalwords=21 where title~'Juliet'"; 
psql cc -c "update work set totalwords=31 where title~'Juliet'"
UPDATE 1
UPDATE 1
UPDATE 1
bucardo kick sync ctest 0
Kick ctest: [1 s] DONE!
## Because cc was the last to be changed, it wins:
for i in {aa,bb,cc}; do psql $i -tc "select current_database(), 
totalwords from work where title ~ 'Juliet'"; done
aa   |   31
bb   |   31
cc   |   31

Under the hood, Bucardo actually applies the list of winning databases to each conflicting row, such that the example above of "B C A" means that database B wins any conflict in which a row was updated by B and C, or B and A, or B and C and A. However, if B did not change the row, and the conflict is only between C and A, then C will win.
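To make that rule concrete, here is a tiny illustration of the per-row selection logic. This is not Bucardo's actual code, just a sketch of the rule described above: walk the ordered list of databases and pick the first one that actually changed the row.

# Illustrative only -- not Bucardo internals.
def pick_winner(priority, changed_by):
    """Return the first database in the priority list that changed this row."""
    for db in priority:
        if db in changed_by:
            return db

# With the ordering "B C A":
print(pick_winner(["B", "C", "A"], {"B", "C"}))  # B
print(pick_winner(["B", "C", "A"], {"C", "A"}))  # C, since B did not change the row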

As an alternative to the built-ins, you can set conflict_strategy to a list of the databases in the sync, ordered from highest priority to lowest, for example "C B A". The list does not have to include all the databases, but it is a good idea to do so. Let's see it in action. We will change the conflict_strategy for our test sync and then reload the sync to have it take effect:


bucardo update sync ctest conflict='B A C'
Set conflict strategy to 'B A C'
bucardo reload sync ctest
Reloading sync ctest...Reload of sync ctest successful
psql aa -c "update work set totalwords=12 where title~'Juliet'"; 
psql bb -c "update work set totalwords=22 where title~'Juliet'"; 
psql cc -c "update work set totalwords=32 where title~'Juliet'"
UPDATE 1
UPDATE 1
UPDATE 1
bucardo kick sync ctest 10
Kick ctest: [1 s] DONE!
## This time bb wins, because B comes before A and C
for i in {aa,bb,cc}; do psql $i -tc "select current_database(), 
totalwords from work where title ~ 'Juliet'"; done
aa   |   22
bb   |   22
cc   |   22

The final strategy for handling conflicts is to write your own code. Many will argue this is the best approach. It is certainly the only one that allows you to embed your business logic into the conflict handling.

Bucardo allows loading of snippets of Perl code known as "customcodes". These codes take effect at specified times, such as after triggers are disabled, or when a sync has failed because of an exception. The specific time we want is called "conflict", and it is an argument to the "whenrun" attribute of the customcode. A customcode needs a name, the whenrun argument, and a file to read in for its content. They can also be associated with one or more syncs or tables.

Once a conflict customcode is in place and a conflict is encountered, the code will be invoked, and it will in turn pass information back to Bucardo telling it how to handle the conflict.

The code should expect a single argument, a hashref containing information about the current sync. This hashref names the current table and gives a list of all conflicted rows. The code can tell Bucardo which database should be considered the winner for each conflicted row, or it can simply declare a winning database for all rows, or even for all tables. It can even modify the data in any of the tables itself. What it cannot do (thanks to the magic of DBIx::Safe) is commit, rollback, or perform other dangerous actions, since we are in the middle of an important transaction.

It's probably best to show by example at this point. Here is a file called ctest1.pl that asks Bucardo to skip to the next applicable customcode if the conflict is in the table "chapter". For any other table, it tells Bucardo that database "C" wins all conflicts, falling back to "B" and then "A".

## ctest1.pl - a sample conflict handler for Bucardo
use strict;
use warnings;

my $info = shift;
## If this table is named 'chapter', do nothing
if ($info->{tablename} eq 'chapter') {
    $info->{skip} = 1;
}
else {
    ## Winning databases, in order
    $info->{tablewinner} = 'C B A';
}
return;

Let's add in this customcode, and associate it with our sync. Then we will reload the sync and cause a conflict.

bucardo add customcode ctest whenrun=conflict src_code=ctest1.pl sync=ctest
Added customcode "ctest"
bucardo reload sync ctest
Reloading sync ctest...Reload of sync ctest successful
psql aa -c "update work set totalwords=13 where title~'Juliet'"; 
psql bb -c "update work set totalwords=23 where title~'Juliet'"; 
psql cc -c "update work set totalwords=33 where title~'Juliet'"
UPDATE 1
UPDATE 1
UPDATE 1
bucardo kick sync ctest 0
Kick ctest: [1 s] DONE!
## This time cc wins, because we set all rows to 'C B A'
for i in {aa,bb,cc}; do psql $i -tc "select current_database(), 
totalwords from work where title ~ 'Juliet'"; done
aa   |   33
bb   |   33
cc   |   33

We used the 'skip' hash value to tell Bucardo to not do anything if the table is named "chapter". In real life, we would have another customcode to handle the skipped table; otherwise any conflict in it will cause the sync to stop. Any number of customcodes can be attached to syncs or tables.

The database preference will last for the remainder of this sync's run, so any other conflicts in other tables will not even bother to invoke the code. You can use the hash key "tablewinneralways" to make this decision sticky, in that it will apply for all future runs by this sync (its KID technically) - which effectively means the decision stays until Bucardo restarts.

One of the important structures sent to the code is a hash named "conflicts", which contains all the changed primary keys and, for each one, a list of which databases were involved in the conflict. A Data::Dumper peek at it would look like this:

$VAR1 = {
  'romeojuliet' => {
    'C' => 1,
    'A' => 1,
    'B' => 1,
  }
};

The job of the conflict handling code (unless using one of the "winner" hash keys) is to change each of those conflicted rows from a hash of involved databases into a string describing the preferred order of databases. The Data::Dumper output would thus look like this:

$VAR1 = {
  'romeojuliet' => 'B'
};

The code snippet would look like this:

## ctest2.pl - a simple conflict handler for Bucardo.
use strict;
use warnings;

my $info = shift;
for my $row (keys %{ $info->{conflicts} }) {
  ## Equivalent to 'A C B'
  $info->{conflicts}{$row} = exists $info->{conflicts}{$row}{A} ? 'A' : 'C';
}

## We don't want any other customcodes to fire: we have handled this!
$info->{lastcode} = 1;
return;

Let's see that code in action. Assuming the above "bucardo add customcode" command was run, we will need to load an updated version, and then reload the sync. We create some conflicts, and check on the results:


bucardo update customcode ctest src_code=ctest2.pl
Changed customcode "ctest" src_code with content of file "ctest2.pl"
bucardo reload sync ctest
Reloading sync ctest...Reload of sync ctest successful
psql aa -c "update work set totalwords=14 where title~'Juliet'"; 
psql bb -c "update work set totalwords=24 where title~'Juliet'"; 
psql cc -c "update work set totalwords=34 where title~'Juliet'"
UPDATE 1
UPDATE 1
UPDATE 1
bucardo kick sync ctest 10
Kick ctest: [2 s] DONE!
## This time aa wins, because we set all rows to 'A C B'
for i in {aa,bb,cc}; do psql $i -tc "select current_database(), 
totalwords from work where title ~ 'Juliet'"; done
aa   |   14
bb   |   14
cc   |   14

That was an obviously oversimplified example, as we picked 'A' for no discernible reason! These conflict handlers can be quite complex, and are limited only by your imagination and your business logic. As a final example, let's have the code examine some other things in the database, as well as jump outside the database itself(!), to determine the resolution of the conflict:

## ctest3.pl - a somewhat silly conflict handler for Bucardo.
use strict;
use warnings;
use LWP;

my $info = shift;

## What is the weather in Walla Walla, Washington?
## If it's really hot, we cannot trust server A
my $max_temp = 100;
my $weather_url = 'http://wxdata.weather.com/wxdata/weather/rss/local/USWA0476?cm_ven=LWO&cm_cat=rss';
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => $weather_url);
my $response = $ua->request($req)->content();
my $temp = ($response =~ /(\d+) °/) ? $1 : 75;  ## fall back to 75 if we cannot parse a temperature
## Store in our shared hash so we don't have to look it up every run
## Ideally we'd add something so we only call it if the temp has not been checked in last hour
$info->{shared}{wallawallatemp} = $temp;

## We want to count the number of sessions on each source database
my $SQL = 'SELECT count(*) FROM pg_stat_activity';
for my $db (sort keys %{ $info->{dbinfo} }) {
    ## Only source databases can have conflicting rows
    next if ! $info->{dbinfo}{$db}{issource};
    ## The safe database handles are stored in $info->{dbh}
    my $dbh = $info->{dbh}{$db};
    my $sth = $dbh->prepare($SQL);
    $sth->execute();
    $info->{shared}{dbcount}{$db} = $sth->fetchall_arrayref()->[0][0];
}

for my $row (keys %{ $info->{conflicts} }) {
    ## If the temp is too high, remove server A from consideration!
    if ($info->{shared}{wallawallatemp} > $max_temp) {
        delete $info->{conflicts}{$row}{A}; ## May not exist, but we delete anyway
    }

    ## Now we can sort by number of connections and let the least busy db win
    (my $winner) = sort {
        $info->{shared}{dbcount}{$a} <=> $info->{shared}{dbcount}{$b}
        or
        ## Fallback to reverse alphabetical if the session counts are the same
        $b cmp $a
    } keys %{ $info->{conflicts}{$row} };

    $info->{conflicts}{$row} = $winner;
}

## We don't want any other customcodes to fire: we have handled this!
$info->{lastcode} = 1;
return;

We'll forgo the demo: suffice it to say that B always won in my tests, as Walla Walla never got over 97 degrees, and all my test databases had the same number of connections. Note some of the other items in the $info hash: "shared" allows arbitrary data to be stored across invocations of the code, and the "lastcode" key tells Bucardo not to fire any more customcodes. While this example is very impractical, it does demonstrate the power available to you when solving conflicts.

Hopefully this article answers many of the questions about conflict handling with Bucardo. Suggestions for new default handlers and examples of real-world conflict handlers are particularly welcome, as well as any other questions or comments. You can find the mailing list at bucardo-general@bucardo.org, and subscribe by visiting the bucardo-general Info Page.