
published by noreply@blogger.com (Steph Skardal) on 2015-03-04 19:13:00

A while back, I wrote about my work on Ruby on Rails based H2O with Annotator, an open source JavaScript library that provides annotation functionality. In that article, I discussed the history of my work with annotations, specifically touching on several iterations with native JavaScript functionality developed over several years to handle colored highlight overlapping on selected text.

Finally, last July we completed the transition to Annotator from custom JavaScript development, with the caveat that the application had quite a bit of customization hooked into Annotator's easily extensible library. But, just a few months after that, I revisited the customization, described in this post.

Separation of UI Concerns

In our initial work with Annotator, we had extended it to offer a few additional features:

  • Add multiple tags with colored highlights, where content can be tagged on the fly with a color chosen from a set of predefined colors assigned to the tag. Text would then be highlighted with opacity, and colors combined (using xColor) on overlapping highlights.
  • Interactive functionality to hide and show un-annotated text, as well as hide and show annotated text with specific tags.
  • Ability to link annotated text to other pieces of content.

But you know the paradox of choice? The extended Annotator, with so many additional features, was offering too many choices in a cluttered user interface, where the choices were likely not used in combination. See:

Too many annotation options: Should it be hidden? Should it be tagged?
Should it be tagged with more than one tag? Can text be tagged and hidden?

So, I separated concerns here (a common term in software) to intentionally separate annotation features. Once a user selects text, a popup is shown to move forward with adding a comment, adding highlights (with or without a tag name), adding a link, or hiding that text:


The new interface, where a user chooses the type of annotation they are saving, and only relevant fields are then visible.


After the user clicks on the highlight option, only highlight related fields are shown.

Annotator API

This functionality required intercepting and overriding Annotator's default behavior, which meant some core overrides, but the API has a few nice hooks that we leveraged to accommodate this functionality in the H2O plugin:

  • annotationsLoaded: called after annotation data is loaded
  • annotationsEditorSubmit: called after the user saves an annotation, before the data is sent to the server
  • annotationCreated: called after an annotation is created
  • annotationUpdated: called after an annotation is updated
  • annotationDeleted: called after an annotation is deleted

While these hooks don't mean much to someone who hasn't worked with Annotator, the point is that there are several ways to extend Annotator throughout the CRUD (create, read, update, destroy) actions on an annotation.

Custom Data

On the backend, all four types of annotation are contained in a single table, just as in the data model prior to this work. Several additional data fields indicate the type of annotation (a rough schema sketch follows the list):

  • hidden (boolean): If true, text is not visible.
  • link (text): Annotation can link to any other URL.
  • highlight (color only, no tag): Annotation can be assigned a single colored highlight.
  • highlight + tag (separate table, using various Rails plugins (acts_as_taggable_on)): Annotation can be assigned a single tag with a corresponding colored highlight.
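
For illustration, here is a minimal, hypothetical SQL sketch of what such a single table might look like. The table and column names are assumptions for clarity, not the actual H2O schema, and the tag association actually lives in separate tables via acts_as_taggable_on:

-- Hypothetical sketch only, not the real H2O schema
CREATE TABLE annotations (
  id         serial PRIMARY KEY,
  content_id integer NOT NULL,       -- the annotated piece of content
  start_pos  integer NOT NULL,       -- boundaries of the selected text
  end_pos    integer NOT NULL,
  hidden     boolean DEFAULT false,  -- if true, the text is not visible
  link       text,                   -- optional URL the selection links to
  highlight  text                    -- optional highlight color (no tag)
  -- highlight + tag: stored in separate tables via acts_as_taggable_on
);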

Conclusion

The changes here resulted in a less cluttered, clearer interface. To a user, each annotation has a single concern, while we still utilize Annotator and save to a single table.


published by noreply@blogger.com (Jacob Minshall) on 2015-03-04 12:30:00 in the "Chef" category
SCaLE Penguin

I recently went to the Southern California Linux Expo (SCaLE). It takes place in Los Angeles at the Hilton, and is four days of talks, classes, and more, all focused on Linux. SCaLE is the largest volunteer-run open source conference. The volunteers put a lot of work into the conference, from the nearly flawless wireless network to the AV team making it as easy as plugging in a computer to start a presentation.


One large focus of the conference was the growing DevOps community in the Linux world. The DevOps-related talks drew the biggest crowds, and there was even a DevOps-focused room on Friday. There is a wide range of DevOps-related topics, but the two that seemed to draw the largest crowds were configuration management and containerization. I decided to attend a full-day talk on Chef (a configuration management solution) and Docker (the new rage in containerization).


The Thursday Chef talk was so full that they decided to do an extra session on Sunday. The talk was more of an interactive tutorial than a lecture, so everyone was provided with an AWS instance to use as their Chef playground. The talk started with the basics of creating a file, installing a package, and running a service. It was all very interactive; there would be a couple of slides explaining a feature and then time provided to try it out. During the talk there was a comment from someone about a possible bug in Chef, concerning the suid bit being reset after a change of owner or group on a file. The presenter, who works for the company that creates Chef, wasn't sure what would happen and said, "Try it out." I did try it out, and there was indeed a bug in Chef. The presenter suggested I file an issue on GitHub, so I did, and I even wrote a patch and made a pull request later that weekend.


Containers were the other hot topic that weekend, with the half-day class on Friday and a few other talks throughout the weekend. The Docker talk was also set up in a learn-by-doing style. We learned the basics of downloading and running Docker images from the Docker Hub through the command line. We added our own tweaks on top of those images and created new images of our own. The speaker, Jerome Petazzoni, usually gives a two- or three-day class on the subject, so he picked the parts he thought most interesting to share with us. I really enjoyed writing a Dockerfile, which describes the creation of a new image from a base image. I also thought one of the use cases described for Docker was very interesting: creating a development environment for employees at a company. There is usually some time wasted moving things from machine to machine, whether upgrading a personal machine or transferring a project from one employee to another, especially when they are using different operating systems. Docker can help create a unified state for all development machines in a company, to the point where setting a new employee up with a workspace can be accomplished in a matter of minutes. This also helps bring the development environment closer to the production environment.


One sentiment I heard reiterated in multiple DevOps talks was the treatment of servers as Pets vs. Cattle. Previously, servers were treated as pets. We gave servers names, we knew what they liked and didn't like, and when they got sick we'd nurse them back to health. This kind of treatment is time consuming and not manageable at the scale that many companies face. The new trend is to treat servers like cattle: each server is given a number, they do their job, and if they get sick they are "put down". Tools like Docker and Chef make this possible; servers can be set up so quickly that there's no reason to nurse them back to health anymore. This is great for large companies that need to manage thousands of servers, but it can save time for smaller companies as well.


published by noreply@blogger.com (Jeff Boes) on 2015-02-25 15:00:00 in the "database" category

SQL queries can get complex in a big hurry. If you are querying multiple tables, and in particular if your query involves operations like UNION and INTERSECT, then you can find yourself in a big, messy pile of SQL. Worse, that big pile will often run slowly, sometimes to the point where a web application times out!

I won't inflict a real-world example on you; that would be cruel. So let's look at a "toy" problem, keeping in mind that this won't illustrate any time-savings, just the technique involved.

Here's the original SQL:

SELECT p.* FROM products p
JOIN (SELECT * FROM inventory WHERE /* complex clause here */) i USING (sku)
UNION ALL
SELECT p.* FROM clearance_products p
JOIN (SELECT * FROM inventory WHERE /* complex clause here */) i USING (sku)

Bonus hint: using "UNION ALL" instead of just "UNION" will allow the query processor to skip an unnecessary step here. "UNION ALL" says you know the rows on either side of the clause are unique. "UNION" means the results will be post-processed to remove duplicates. This might save you more than a smidgen of time, depending on how large the two sub-queries get.
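To see the difference in miniature (nothing here is specific to the inventory example):

-- UNION performs an extra de-duplication step; UNION ALL simply appends:
SELECT 1 UNION     SELECT 1;   -- one row
SELECT 1 UNION ALL SELECT 1;   -- two rows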

Now, many times the query optimizer will just do the right thing here. But sometimes (cough, cough-MySQL), your database isn't quite up to the task. So you have to shoulder the burden and help out. That's where we can apply a temporary table.

Temporary tables are created for the length of the database session; that's different than a transaction. For a web application, that's usually (not always) the length of the request (i.e., from the time your web application opens a database connection, until it explicitly closes it, or until it returns control to the web server, usually by passing it a completed page). For a script, it's a similar duration, e.g. until the script exits.
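As a quick sketch of that scoping (the table name here is made up), a temporary table is visible only to the session that created it and is dropped automatically when that session ends:

-- Session A:
CREATE TEMPORARY TABLE scratch (id int);
SELECT * FROM scratch;   -- works here

-- Session B (a different connection):
SELECT * FROM scratch;   -- ERROR: relation "scratch" does not exist

-- When session A disconnects, scratch is dropped automatically.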

CREATE TEMPORARY TABLE cross_inventory AS
SELECT * FROM inventory WHERE /* complex clause here */;

CREATE INDEX cross_inv_sku ON cross_inventory(sku);

There's no significant difference for our purposes between a "permanent" and a "temporary" table. However, you do have to keep in mind that these tables are created without indexes, so if your goal is to improve the speed of queries involving the data here, adding an index after creating the table is usually desirable.

With all this in place, now we can:

SELECT p.* FROM products p
JOIN cross_inventory i USING (sku)
UNION ALL
SELECT p.* FROM clearance_products p
JOIN cross_inventory i USING (sku)

Sometimes your temporary table will be built up not by a straightforward "CREATE ... AS SELECT ...", but by your application:

CREATE TEMPORARY TABLE tmp_inventory AS SELECT * FROM inventory WHERE false;
CREATE INDEX tmp_inv_sku ON tmp_inventory(sku);

And then within the application:

# Pseudocode
while (more_data) {
  row = build_inv_record(more_data);
  sql_do('INSERT INTO tmp_inventory VALUES (?,?,...)', row);
}

Here, we are creating an empty "inventory" table template as a temporary table ("SELECT * FROM inventory WHERE false"), then adding rows to it from the application, and finally running our query. Note that in a practical application of this, it's not likely to be a lot faster, because the individual INSERT statements will take time. But this approach may have some utility where the existing "inventory" table doesn't have the data we want to JOIN against, or has the data, but not in a way we can easily filter.

I've used temporary tables (in a MySQL/Interchange/Perl environment) to speed up a query by a factor of two or more. It's usually in those cases where you have a complex JOIN that appears in two or more parts of the query (again, usually a UNION). I've even had big-win situations where the same temporary table was used in two different queries during the same session.
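As a sketch of that last point (assuming the session stays open between statements; the warehouse column in the second query is hypothetical), the same temporary table can back two otherwise unrelated queries:

CREATE TEMPORARY TABLE cross_inventory AS
SELECT * FROM inventory WHERE /* complex clause here */;

CREATE INDEX cross_inv_sku ON cross_inventory(sku);

-- First query: the UNION from above
SELECT p.* FROM products p JOIN cross_inventory i USING (sku)
UNION ALL
SELECT p.* FROM clearance_products p JOIN cross_inventory i USING (sku);

-- Second query, same session: a different report against the same rows
SELECT i.warehouse, count(*) FROM cross_inventory i GROUP BY i.warehouse;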

A similar approach is the Common Table Expression (CTE) found in PostgreSQL starting with version 8.4. This allows you to identify the rows you would be pouring into your temporary table as a named result-set, then reference it in your query. Our "toy" example would become:

WITH cross_inventory AS
(SELECT * FROM inventory WHERE /* complex clause here */)
SELECT p.* FROM products p
JOIN cross_inventory i USING (sku)
UNION ALL
SELECT p.* FROM clearance_products p
JOIN cross_inventory i USING (sku)

I've not had an opportunity to use CTEs yet, and of course they aren't available in MySQL, so the temporary-table technique will still have a lot of value for me in the foreseeable future.


published by noreply@blogger.com (Greg Sabino Mullane) on 2015-02-24 12:00:00 in the "databases" category

Way back in 2005 I added the ON_ERROR_ROLLBACK feature to psql, the Postgres command line client. When enabled, any errors cause an immediate rollback to just before the previous command. What this means is that you can stay inside your transaction, even if you make a typo (the main error-causing problem and the reason I wrote it!). Since I sometimes see people wanting to emulate this feature in their application or driver, I thought I would explain exactly how it works in psql.

First, it must be understood that this is not a Postgres feature, and there is no way you can instruct Postgres itself to ignore errors inside of a transaction. The work must be done by a client (such as psql) that can do some voodoo behind the scenes. The ON_ERROR_ROLLBACK feature is available since psql version 8.1.

Normally, any error you make will throw an exception and cause your current transaction to be marked as aborted. This is sane and expected behavior, but it can be very, very annoying if it happens when you are in the middle of a large transaction and mistype something! At that point, the only thing you can do is rollback the transaction and lose all of your work. For example:

greg=# CREATE TABLE somi(fav_song TEXT, passphrase TEXT, avatar TEXT);
CREATE TABLE
greg=# begin;
BEGIN
greg=# INSERT INTO somi VALUES ('The Perfect Partner', 'ZrgRQaa9ZsUHa', 'Andrastea');
INSERT 0 1
greg=# INSERT INTO somi VALUES ('Holding Out For a Hero', 'dx8yGUbsfaely', 'Janus');
INSERT 0 1
greg=# INSERT INTO somi BALUES ('Three Little Birds', '2pX9V8AKJRzy', 'Charon');
ERROR:  syntax error at or near "BALUES"
LINE 1: INSERT INTO somi BALUES ('Three Little Birds', '2pX9V8AKJRzy'...
greg=# INSERT INTO somi VALUES ('Three Little Birds', '2pX9V8AKJRzy', 'Charon');
ERROR:  current transaction is aborted, commands ignored until end of transaction block
greg=# rollback;
ROLLBACK
greg=# select count(*) from somi;
 count
-------
     0

When ON_ERROR_ROLLBACK is enabled, psql will issue a SAVEPOINT before every command you send to Postgres. If an error is detected, it will then issue a ROLLBACK TO the previous savepoint, which basically rewinds history to the point in time just before you issued the command. Which then gives you a chance to re-enter the command without the mistake. If an error was not detected, psql does a RELEASE savepoint behind the scenes, as there is no longer any reason to keep the savepoint around. So our example above becomes:

greg=# \set ON_ERROR_ROLLBACK interactive
greg=# begin;
BEGIN
greg=# INSERT INTO somi VALUES ('Volcano', 'jA0EBAMCV4e+-^', 'Phobos');
INSERT 0 1
greg=# INSERT INTO somi VALUES ('Son of a Son of a Sailor', 'H0qHJ3kMoVR7e', 'Proteus');
INSERT 0 1
greg=# INSERT INTO somi BALUES ('Xanadu', 'KaK/uxtgyT1ni', 'Metis');
ERROR:  syntax error at or near "BALUES"
LINE 1: INSERT INTO somi BALUES ('Xanadu', 'KaK/uxtgyT1ni'...
greg=# INSERT INTO somi VALUES ('Xanadu', 'KaK/uxtgyT1ni', 'Metis');
INSERT 0 1
greg=# commit;
COMMIT
greg=# select count(*) from somi;
 count
-------
     3

What about if you create a savepoint yourself? Or even a savepoint with the same name as the one that psql uses internally? Not a problem - Postgres allows multiple savepoints with the same name, and will rollback or release the latest one created, which allows ON_ERROR_ROLLBACK to work seamlessly with user-provided savepoints.
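
Here is a short sketch, reusing the somi table from above with made-up values, showing that rollback and release always apply to the most recently defined savepoint of that name, leaving an earlier one with the same name intact:

BEGIN;
SAVEPOINT sp;                                          -- user-created savepoint
INSERT INTO somi VALUES ('Africa', 'abc123', 'Io');
SAVEPOINT sp;                                          -- second savepoint, same name
INSERT INTO somi BALUES ('Oops', 'xyz789', 'Europa');  -- error: transaction aborted
ROLLBACK TO sp;                                        -- rewinds to the second sp only
RELEASE sp;                                            -- releases the second sp
ROLLBACK TO sp;                                        -- the first sp is still available
COMMIT;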

Note that the example above sets ON_ERROR_ROLLBACK (yes it is case sensitive!) to 'interactive', not just 'on'. This is a good idea, as you generally want it to catch human errors, and not just plow through a SQL script.
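
If you want this behavior in every interactive session, the same setting can go into your ~/.psqlrc file:

\set ON_ERROR_ROLLBACK interactive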

So, if you want to add this to your own application, you will need to wrap each command in a hidden savepoint, and then rollback or release it. The end-user should not see the SAVEPOINT, ROLLBACK TO, or RELEASE commands. Thus, the SQL sent to the backend will change from this:

BEGIN; ## entered by the user
INSERT INTO somi VALUES ('Mr. Roboto', 'H0qHJ3kMoVR7e', 'Triton');
INSERT INTO somi VALUES ('A Mountain We Will Climb', 'O2DMZfqnfj8Tle', 'Tethys');
INSERT INTO somi BALUES ('Samba de Janeiro', 'W2rQpGU0MfIrm', 'Dione');

to this:

BEGIN; ## entered by the user
SAVEPOINT myapp_temporary_savepoint ## entered by the application
INSERT INTO somi VALUES ('Mr. Roboto', 'H0qHJ3kMoVR7e', 'Triton');
RELEASE myapp_temporary_savepoint

SAVEPOINT myapp_temporary_savepoint
INSERT INTO somi VALUES ('A Mountain We Will Climb', 'O2DMZfqnfj8Tle', 'Tethys');
RELEASE myapp_temporary_savepoint

SAVEPOINT myapp_temporary_savepoint
INSERT INTO somi BALUES ('Samba de Janeiro', 'W2rQpGU0MfIrm', 'Dione');
ROLLBACK TO myapp_temporary_savepoint

Here is some pseudo-code illustrating the sequence of events. To see the actual implementation in psql, take a look at src/bin/psql/common.c.


run("SAVEPOINT myapp_temporary_savepoint");
run($usercommand);
if (txn_status == ERROR) {
  run("ROLLBACK TO myapp_temporary_savepoint");
}
if (command was "savepoint" or "release" or "rollback") {
  ## do nothing
}
elsif (txn_status == IN_TRANSACTION) {
  run("RELEASE myapp_temporary_savepoint");
}

While there is some overhead in constantly creating and tearing down so many savepoints, it is quite small, especially if you are using it in an interactive session. This ability to automatically roll things back is especially powerful when you remember that Postgres can roll everything back, including DDL (e.g. CREATE TABLE). Certain other expensive database systems do not play well when mixing DDL and transactions.
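
As a quick sketch of that last point, even a CREATE TABLE vanishes cleanly on rollback:

BEGIN;
CREATE TABLE rollback_demo(id int);    -- DDL inside a transaction
INSERT INTO rollback_demo VALUES (1);
ROLLBACK;
-- Both the row and the table are now gone; a SELECT fails with
-- ERROR:  relation "rollback_demo" does not exist
SELECT * FROM rollback_demo;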


published by Eugenia on 2015-02-21 18:05:18 in the "Filmmaking" category
Eugenia Loli-Queru

A horror-action short movie that I shot using a small Canon S110.


published by noreply@blogger.com (wojciechz) on 2015-02-18 13:26:00 in the "BATS" category
All Liquid Galaxy setups deployed by End Point are managed by Chef. A typical deployment consists of approximately 3 to 9 Linux boxes, of which only 1 is managed directly; the rest boot an ISO from that machine over the network, with a copy-on-write root filesystem. Because of this, a typical deployment involves more steps than just updating your code and restarting the application. Deployment plus rollback may take even 10 times longer than for a typical web application. Due to this fact, we need to test our infrastructure extensively.

What are we to do in order to make sure that our infrastructure is tested well before it hits production?


Scary? - It's not.


Workflow broken down by pieces



  • lg_chef.git repo - where we keep cookbooks, environments and node definitions
  • GitHub pull request - artifact of infrastructure source code tested by Jenkins
  • Vagrant - virtual environment in which Chef is run in order to test the artifact. There's always 1 master node and a few Vagrant boxes that boot an ISO from the master via the TFTP protocol
  • chef-zero - Chef flavor used to converge and test the infrastructure on the basis of GitHub pull request
  • chef-server/chef-client - Chef flavor used to converge and test production and pre-production environment
  • Jenkins - Continuous Integration environment that runs the converge process and part of the tests
  • Tests - two frameworks are used: BATS (for the top-level integration tests) and minitest (for the after-converge tests)
  • lg-live-build - our fork of Debian live build used to build the ISO that is booted by Vagrant slaves

Workflow broken down by the order of actions

  1. User submits GitHub pull request to lg_chef.git repo
  2. GitHub pull request gets picked up by Jenkins
  3. Jenkins creates 1 master Vagrant node and several slave nodes
  4. chef-zero converges the master Vagrant box and runs minitest
  5. BATS tests run on the freshly converged Vagrant master box. A few steps are performed here: the ISO is built, distributed to the slaves, the slaves boot the ISO, and final integration tests are run to see whether the slaves have all the goodness.
  6. If points 1 to 5 are green, the developer merges the changes, uploads the updated cookbooks, node definitions, roles, and environments, and runs the final tests.

What didn't work for us and why

  • kitchen-vagrant - it didn't play well with Jenkins (or the JVM itself) and didn't know how to use advanced Vagrant features for specifying multiple networking options, interfaces, and drivers. However, it does support using your own Vagrantfile.erb
  • We had some doubts about keeping all the cookbooks, environments, and node definitions in one repo, because chef-server/chef-client tests can only test your changes once they are uploaded to the Chef server; chef-zero came in handy here

The code

As previously mentioned, we needed our own vagrant template file.

Vagrant.configure("2") do |config|
  <% if @data[:chef_type] == "chef_zero" %>
  config.chef_zero.enabled = true
  config.chef_zero.chef_server_url = "<%= @data[:chef_server_url] %>"
  config.chef_zero.roles = "../../roles/"
  config.chef_zero.cookbooks = "../../cookbooks/"
  config.chef_zero.environments = "../../environments/"
  config.chef_zero.data_bags = "../integration/default/data_bags/"
  <% else %>
  config.chef_zero.enabled = false
  <% end %>
  config.omnibus.chef_version = "<%= @data[:chef_version] %>"
  config.vm.define "<%= @data[:headnode][:slug] %>" do |h|
  h.vm.box = "<%= @data[:headnode][:box] %>"
    h.vm.box_url = "<%= @data[:headnode][:box_url] %>"
    h.vm.hostname = "<%= @data[:headnode][:hostname] %>"
    h.vm.network(:private_network, {:ip => '10.42.41.1'})
    h.vm.synced_folder ".", "/vagrant", disabled: true
    h.vm.provider :virtualbox do |p|
      <% @data[:headnode][:customizations].each do |key, value| %>
        p.customize ["modifyvm", :id, "<%= key %>", "<%= value %>"]
      <% end %>
    end
    h.vm.provision :chef_client do |chef|
      <% if @data[:chef_type] == "chef_zero" %>
      chef.environment = "<%= @data[:headnode][:provision][:environment] %>"
      chef.run_list = <%= @data[:run_list] %>
      chef.json = <%= @data[:node_definition] %>
      chef.chef_server_url = "<%= @data[:chef_server_url] %>"
      <% else %>
      chef.chef_server_url = "<%= @data[:headnode][:provision][:chef_server_url] %>"
      <% end %>
      chef.validation_key_path = "<%= @data[:headnode][:provision][:validation_key_path] %>"
      chef.encrypted_data_bag_secret_key_path = "<%= @data[:headnode][:provision][:encrypted_data_bag_secret_key_path] %>"
      chef.verbose_logging = <%= @data[:headnode][:provision][:verbose_logging] %>
      chef.log_level = "<%= @data[:headnode][:provision][:log_level] %>"
      chef.node_name = "<%= @data[:headnode][:provision][:node_name] %>"
    end
  end

  #display nodes
  <% @data[:display_nodes][:nodes].each do |dn| %>
  config.vm.define "<%= dn[:slug] %>" do |dn_config|
    dn_config.vm.box_url = "<%= @data[:display_nodes][:global][:box_url] %>"
    dn_config.vm.hostname = "<%= dn[:hostname] %>"
    dn_config.vm.box = "<%= @data[:display_nodes][:global][:box] %>"
    dn_config.vm.synced_folder ".", "/vagrant", disabled: true
    dn_config.vm.boot_timeout = 1
    dn_config.vm.provider :virtualbox do |p|
    <% @data[:display_nodes][:global][:customizations].each do |key, value| %>
      p.customize ["modifyvm", :id, "<%= key %>", "<%= value %>"]
    <% end %>
      p.customize ["modifyvm", :id, "--macaddress1", "<%= dn[:mac] %>"]
      p.customize ["createhd", "--filename", "../files/<%= dn[:slug] %>.vmdk", "--size", 80*1024]
      p.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "hdd", "--medium", "none"]
      p.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "hdd", "--medium", "../files/<%= dn[:slug] %>.vmdk"]
      p.customize ["storagectl", :id, "--name", "SATA Controller", "--add", "sata",  "--controller", "IntelAHCI", "--hostiocache", "on"]
      p.customize ["storageattach", :id, "--storagectl", "SATA Controller", "--port", 1, "--device", 0, "--type", "hdd", "--medium", "../files/ipxe_<%= dn[:slug] %>.vmdk"] 
    end
  end
  <% end %>
end

It renders a Vagrantfile out of the following data:

{
  "description" : "This file is used to generate Vagrantfile and run_test.sh and also run tests. It should contain _all_ data needed to render the templates and run teh tests.",
  "chef_version" : "11.12.4",
  "chef_type" : "chef_zero",
  "vagrant_template_file" : "vagrantfile.erb",
  "run_tests_template_file" : "run_tests.sh.erb",
  "chef_server_url" : "http://192.168.1.2:4000",
  "headnode" :
    {
    "slug" : "projectX-pull-requests",
    "box" : "opscode-ubuntu-14.04",
    "box_url" : "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box",
    "hostname" : "lg-head",
    "bats_tests_dir" : "projectX-pr",
    "customizations" : {
      "--memory" : "2048",
      "--cpus": "2",
      "--nic1" : "nat",
      "--nic2": "intnet",
      "--nic3": "none",
      "--nic4" : "none",
      "--nictype1": "Am79C970A",
      "--nictype2": "Am79C970A",
      "--intnet2": "projectX-pull-requests"
    },
    "provision" : {
      "chef_server_url" : "https://chefserver.ourdomain.com:40443",
      "validation_key_path" : "~/.chef/validation.pem",
      "encrypted_data_bag_secret_key_path" : "~/.chef/encrypted_data_bag_secret",
      "node_name" : "lg-head-projectXtest.liquid.glx",
      "environment" : "pull_requests",
      "verbose_logging" : true,
      "log_level" : "info"
    }
  },
  "display_nodes" : {
    "global" : {
      "box" : "opscode-ubuntu-14.04",
      "box_url" : "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box",
      "customizations" : {
        "--memory" : "2048",
        "--cpus" : "1",
        "--boot1" : "floppy",
        "--boot2" : "net",
        "--boot3" : "none",
        "--boot4" : "none"
    },
    "provision" : {
      "chef_server_url" : "https://chefserver.ourdomain.com:40443",
      "validation_key_path" : "~/.chef/validation.pem",
      "encrypted_data_bag_secret_key_path" : "~/.chef/encrypted_data_bag_secret",
      "node_name" : "lg-head-projectXtest.liquid.glx",
      "environment" : "pull_requests",
      "verbose_logging" : true,
      "log_level" : "info"
    }
  },
  "display_nodes" : {
    "global" : {
      "box" : "opscode-ubuntu-14.04",
      "box_url" : "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box",
      "customizations" : {
        "--memory" : "2048",
        "--cpus" : "1",
        "--boot1" : "floppy",
        "--boot2" : "net",
        "--boot3" : "none",
        "--boot4" : "none",
        "--intnet1" : "projectX-pull-requests",
        "--nicpromisc1": "allow-all",
        "--nic1" : "intnet",
        "--nic2": "none",
        "--nic3": "none",
        "--nic4": "none",
        "--nictype1": "Am79C970A",
        "--ioapic": "on"
      }
    },
    "nodes" : [
      {
      "slug" : "projectX-pull-requests-kiosk",
      "hostname" : "kiosk",
      "mac" : "5ca1ab1e0001"
    },
    {
      "slug" : "projectX-pull-requests-display",
      "hostname" : "display",
      "mac" : "5ca1ab1e0002"
    }
    ]
  }
}

As a result we get an on-the-fly Vagrantfile that's used during the testing:

Vagrant.configure("2") do |config|

  config.chef_zero.enabled = true
  config.chef_zero.chef_server_url = "http://192.168.1.2:4000"
  config.chef_zero.roles = "../../roles/"
  config.chef_zero.cookbooks = "../../cookbooks/"
  config.chef_zero.environments = "../../environments/"
  config.chef_zero.data_bags = "../integration/default/data_bags/"

  config.omnibus.chef_version = "11.12.4"
  config.vm.define "projectX-pull-requests" do |h|
  h.vm.box = "opscode-ubuntu-14.04"
    h.vm.box_url = "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box"
    h.vm.hostname = "lg-head"
    h.vm.network(:private_network, {:ip => '10.42.41.1'})
    h.vm.synced_folder ".", "/vagrant", disabled: true
    h.vm.provider :virtualbox do |p|
        p.customize ["modifyvm", :id, "--memory", "2048"]
        p.customize ["modifyvm", :id, "--cpus", "2"]
        p.customize ["modifyvm", :id, "--nic1", "nat"]
        p.customize ["modifyvm", :id, "--nic2", "intnet"]
        p.customize ["modifyvm", :id, "--nic3", "none"]
        p.customize ["modifyvm", :id, "--nic4", "none"]
        p.customize ["modifyvm", :id, "--nictype1", "Am79C970A"]
        p.customize ["modifyvm", :id, "--nictype2", "Am79C970A"]
        p.customize ["modifyvm", :id, "--intnet2", "projectX-pull-requests"]
    end
    h.vm.provision :chef_client do |chef|

      chef.environment = "pull_requests"
      chef.run_list = ["role[lg-head-nocms]", "recipe[lg_live_build]", "recipe[lg_tftproot]", "recipe[lg_projectX]", "recipe[lg_test]", "recipe[test_mode::bats]", "recipe[hostsfile::projectX]"]
      chef.json = {:sysctl=>{:params=>{:vm=>{:swappiness=>20}, :net=>{:ipv4=>{:ip_forward=>1}}}}, :test_mode=>true, :call_ep=>{:keyname=>"ProjectX CI Production", :fwdport=>33299}, :tftproot=>{:managed=>true}, :tags=>[], :lg_cms=>{:remote=>"github"}, :monitor=>true, :lg_grub=>{:cmdline=>"nomodeset biosdevname=0"}, :projectX=>{:repo_branch=>"development", :display_host=>"42-a", :kiosk_host=>"42-b", :sensors_host=>"42-b", :maps_url=>"https://www.google.com/maps/@8.135687,-75.0973243,17856994a,40.4y,1.23h/data=!3m1!1e3?esrch=Tactile::TactileAcme,Tactile::ImmersiveModeEnabled"}, :liquid_galaxy=>{:touchscreen_link=>"/dev/input/lg_active_touch", :screenshotd=>{:screen_rows=>"1", :screen_columns=>"1"}, :screenshot_service=>true, :display_nodes=>[{:hostname=>"42-c"}, {:allowed_pages=>["Google Maps", "Pacman Doodle", "jellyfish", "Doodle Selection", "ProjectX Video Player", "Composer kiosk"], :hostname=>"42-b", :mac=>"5c:a1:ab:1e:00:02", :features=>"mandatory_windows, plain_gray, starry_skies", :bad_windows_names=>"Google Earth - Login Status", :mandatory_windows_names=>"awesome", :screens=>[{:display=>":0", :crtc=>"default", :grid_order=>"0"}], :screen_rotation=>"normal", :audio_device=>"{type hw; card DGX; device 0}", :onboard_enable=>true, :keyboard_enable=>true, :mouse_enable=>true, :cursor_enable=>true, :background_extension=>"jpg", :background_mode=>"zoom-fill", :projectX=>{:extensions=>{:kiosk=>"ProjectX Kiosk", :google_properties_menu=>"Google Properties Menu", :onboard=>"Onboard", :no_right_click=>"Right Click Killer", :render_statistics=>"Render Statistics"}, :browser_slug=>"lgS0", :urls=>"https://www.google.com/maps", :ros_nodes=>[{:name=>"rfreceiver_reset", :pkg=>"rfreceiver", :type=>"kill_browser.py"}, {:name=>"proximity", :pkg=>"maxbotix", :type=>"sender.py"}, {:name=>"spacenav", :pkg=>"spacenav_node", :type=>"spacenav_node"}, {:name=>"leap", :pkg=>"leap_motion", :type=>"sender.py"}, {:name=>"projectX_nav", :pkg=>"projectX_nav", :type=>"projectX_nav"}, {:name=>"onboard", :pkg=>"onboard", :type=>"listener.py"}, {:name=>"rosbridge", :pkg=>"rosbridge_server", :type=>"rosbridge_websocket", :params=>[{:name=>"certfile", :value=>"/home/lg/etc/ros.crt"}, {:name=>"keyfile", :value=>"/home/lg/etc/ros.key"}]}]}, :browser_infinite_url=>"http://lg-head/projectX-loader.html"}, {:hostname=>"42-a", :mac=>"5c:a1:ab:1e:00:01", :features=>"mandatory_windows, plain_gray, starry_skies, erroneous_text", :bad_windows_names=>"Google Earth - Login Status", :mandatory_windows_names=>"awesome", :screens=>[{:display=>":0", :crtc=>"default", :grid_order=>"1"}], :keyboard_enable=>true, :mouse_enable=>true, :cursor_enable=>true, :background_extension=>"jpg", :background_mode=>"zoom-fill", :nvidia_mosaic=>true, :manual_layout=>{:default=>"1024x768+0+0"}, :projectX=>{:extensions=>{:display=>"ProjectX Large Display", :pacman=>"pacman", :render_statistics=>"Render Statistics"}, :browser_slug=>"lgS0", :urls=>"https://www.google.com/maps", :ros_nodes=>[{:name=>"geodata", :pkg=>"geodata", :type=>"geodata_server.py"}]}, :browser_infinite_url=>"http://lg-head/projectX-loader.html", :default_browser_bin=>"google-chrome", :allowed_pages=>["Google Maps", "Pacman Doodle", "jellyfish", "ProjectX Video Player", "Composer wall"]}], :has_cec=>false, :google_office=>false, :viewsync_master=>"42-b", :has_touchscreen=>false, :has_spacenav=>true, :support_name=>"projectX-ci", :podium_interface=>"http://lg-head", :podium_display=>"42-b:default"}}
      chef.chef_server_url = "http://192.168.1.2:4000"

      chef.validation_key_path = "~/.chef/validation.pem"
      chef.encrypted_data_bag_secret_key_path = "~/.chef/encrypted_data_bag_secret"
      chef.verbose_logging = true
      chef.log_level = "info"
      chef.node_name = "lg-head-projectXtest.liquid.glx"
    end
  end

  #display nodes

  config.vm.define "projectX-pull-requests-kiosk" do |dn_config|
    dn_config.vm.box_url = "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box"
    dn_config.vm.hostname = "kiosk"
    dn_config.vm.box = "opscode-ubuntu-14.04"
    dn_config.vm.synced_folder ".", "/vagrant", disabled: true
    dn_config.vm.boot_timeout = 1
    dn_config.vm.provider :virtualbox do |p|
      p.customize ["modifyvm", :id, "--memory", "2048"]
      p.customize ["modifyvm", :id, "--cpus", "1"]
      p.customize ["modifyvm", :id, "--boot1", "floppy"]
      p.customize ["modifyvm", :id, "--boot2", "net"]
      p.customize ["modifyvm", :id, "--boot3", "none"]
      p.customize ["modifyvm", :id, "--boot4", "none"]
      p.customize ["modifyvm", :id, "--intnet1", "projectX-pull-requests"]
      p.customize ["modifyvm", :id, "--nicpromisc1", "allow-all"]
      p.customize ["modifyvm", :id, "--nic1", "intnet"]
      p.customize ["modifyvm", :id, "--nic2", "none"]
      p.customize ["modifyvm", :id, "--nic3", "none"]
      p.customize ["modifyvm", :id, "--nic4", "none"]
      p.customize ["modifyvm", :id, "--nictype1", "Am79C970A"]
      p.customize ["modifyvm", :id, "--ioapic", "on"]
      p.customize ["modifyvm", :id, "--macaddress1", "5ca1ab1e0001"]
      p.customize ["createhd", "--filename", "../files/projectX-pull-requests-kiosk.vmdk", "--size", 80*1024]
      p.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "hdd", "--medium", "none"]
      p.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "hdd", "--medium", "../files/projectX-pull-requests-kiosk.vmdk"]
      p.customize ["storagectl", :id, "--name", "SATA Controller", "--add", "sata",  "--controller", "IntelAHCI", "--hostiocache", "on"]
      p.customize ["storageattach", :id, "--storagectl", "SATA Controller", "--port", 1, "--device", 0, "--type", "hdd", "--medium", "../files/ipxe_projectX-pull-requests-kiosk.vmdk"]
    end
  end

  config.vm.define "projectX-pull-requests-display" do |dn_config|
    dn_config.vm.box_url = "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box"
    dn_config.vm.hostname = "display"
    dn_config.vm.box = "opscode-ubuntu-14.04"
    dn_config.vm.synced_folder ".", "/vagrant", disabled: true
    dn_config.vm.boot_timeout = 1
    dn_config.vm.provider :virtualbox do |p|
      p.customize ["modifyvm", :id, "--memory", "2048"]
      p.customize ["modifyvm", :id, "--cpus", "1"]
      p.customize ["modifyvm", :id, "--boot1", "floppy"]
      p.customize ["modifyvm", :id, "--boot2", "net"]
      p.customize ["modifyvm", :id, "--boot3", "none"]
      p.customize ["modifyvm", :id, "--boot4", "none"]
      p.customize ["modifyvm", :id, "--intnet1", "projectX-pull-requests"]
      p.customize ["modifyvm", :id, "--nicpromisc1", "allow-all"]
      p.customize ["modifyvm", :id, "--nic1", "intnet"]
      p.customize ["modifyvm", :id, "--nic2", "none"]
      p.customize ["modifyvm", :id, "--nic3", "none"]
      p.customize ["modifyvm", :id, "--nic4", "none"]
      p.customize ["modifyvm", :id, "--nictype1", "Am79C970A"]
      p.customize ["modifyvm", :id, "--ioapic", "on"]
      p.customize ["modifyvm", :id, "--macaddress1", "5ca1ab1e0002"]
      p.customize ["createhd", "--filename", "../files/projectX-pull-requests-display.vmdk", "--size", 80*1024]
      p.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "hdd", "--medium", "none"]
      p.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "hdd", "--medium", "../files/projectX-pull-requests-display.vmdk"]
      p.customize ["storagectl", :id, "--name", "SATA Controller", "--add", "sata",  "--controller", "IntelAHCI", "--hostiocache", "on"]
      p.customize ["storageattach", :id, "--storagectl", "SATA Controller", "--port", 1, "--device", 0, "--type", "hdd", "--medium", "../files/ipxe_projectX-pull-requests-display.vmdk"]
    end
  end

end

Finally we have the environment stored in one Vagrantfile. The missing part is how to run tests on it.
The testing script does the following:

#!/bin/bash
set -e
# FUNCTIONS

function halt_vm () {
  vms=`vboxmanage list vms | grep "vagrant_$1_" | awk {'print $1'} | sed s/'"'//g`
  echo "Stopping VM $vms"
  stop_result=$(for vm in $vms ; do vboxmanage controlvm $vm poweroff; echo $?; done)
  echo "Output of stopping VM $1 : $stop_result"
}

function boot_vm () {
  vms=`vboxmanage list vms | grep "vagrant_$1_" | awk {'print $1'} | sed s/'"'//g`
  echo "Booting VM $vms"
  start_result=$(for vm in $vms ; do vboxmanage startvm $vm --type headless; echo $?; done)
  echo "Output of booting VM $1 : $start_result"
  echo "Sleeping additional 15 secs after peacefull boot"
  sleep 15
}

function add_keys () {
  for i in `find /var/lib/jenkins/.ssh/id_rsa* | grep -v '.pub'` ; do ssh-add $i ; done
}

#vars

knifeclient_name=lg-head-projectXtest.liquid.glx
headnode_name=projectX-pull-requests

# TEST SCENARIO

cd test/vagrant

# teardown of previous sessions
vagrant destroy projectX-pull-requests-kiosk -f
vagrant destroy projectX-pull-requests-display -f
vagrant destroy $headnode_name -f

echo "Not managing knife client because => chef_zero "
echo "All ssh keys presented below"
ssh-add -l

# headnode
vagrant up ${headnode_name}

# displaynodes

result=$(vboxmanage convertfromraw ../files/ipxe.usb ../files/ipxe_projectX-pull-requests-kiosk.vmdk --format=VMDK ; vagrant up projectX-pull-requests-kiosk; echo $?)
echo "projectX-pull-requests-kiosk : $result"

result=$(vboxmanage convertfromraw ../files/ipxe.usb ../files/ipxe_projectX-pull-requests-display.vmdk --format=VMDK ; vagrant up projectX-pull-requests-display; echo $?)
echo "projectX-pull-requests-display : $result"



# test phase
OPTIONS=`vagrant ssh-config  ${headnode_name} | grep -v ${headnode_name} | awk -v ORS=' ' '{print "-o " $1 "=" $2}'`
scp ${OPTIONS} ../integration/projectX-pr/bats/*.bats vagrant@${headnode_name}:/tmp/bats_tests

ssh ${OPTIONS} ${headnode_name} '/usr/local/bin/bats /tmp/bats_tests/pre_build_checks.bats'

halt_vm projectX-pull-requests-kiosk
halt_vm projectX-pull-requests-display

echo "Building teh ISO (it may take a long time)"
ssh ${OPTIONS} ${headnode_name} '/usr/local/bin/bats /tmp/bats_tests/build_iso.bats'

ssh ${OPTIONS} ${headnode_name} '/usr/local/bin/bats /tmp/bats_tests/set_grub_to_make_partitions.bats'

echo "Booting nodes"


boot_vm projectX-pull-requests-kiosk
boot_vm projectX-pull-requests-display


echo "Sleeping 30 secs for the DNS to boot and setting the grub to boot the ISO"
sleep 30

ssh ${OPTIONS} ${headnode_name} '/usr/local/bin/bats /tmp/bats_tests/set_grub_to_boot_the_iso.bats'
echo "Sleeping for 4 mins for the displaynodes to boot fresh ISO"
sleep 240

echo "Running the tests inside the headnode:"

ssh ${OPTIONS} ${headnode_name} '/usr/local/bin/bats /tmp/bats_tests/post_checks.bats'

So finally we get the following pipeline:

  1. Clone Chef pull request from GitHub
  2. Create Vagrantfile on the basis of Vagrantfile template
  3. Create run_tests.sh script for running the tests
  4. Destroy all previously created Vagrant boxes
  5. Create one Chef Vagrant box
  6. Create ISO Vagrant boxes with ipxe bootloader
  7. Converge the Vagrant box with Chef
  8. Copy BATS tests onto the headnode
  9. Run initial BATS tests that build an ISO
  10. Boot display nodes with the newly created ISO
  11. Run final integration tests on the stack

Elapsed time: between 40 and 50 minutes.

published by noreply@blogger.com (Greg Sabino Mullane) on 2015-02-16 18:59:00 in the "database" category

One of the many reasons I love Postgres is the responsiveness of the developers. Last week I posted an article about the dangers of reinstating some implicit data type casts. Foremost among the dangers was the fact that pg_dump will not dump user-created casts in the pg_catalog schema. Tom Lane (eximious Postgres hacker) read this and fixed it up - the very same day! So in git head (which will become Postgres version 9.5 someday) we no longer need to worry about custom casts disappearing with a pg_dump and reload. These same-day fixes are not an unusual thing for the Postgres project.

For due diligence, let's make sure that the casts now survive a pg_dump and reload into a new database via psql:


psql -qc 'drop database if exists casting_test'
psql -qc 'create database casting_test'
psql casting_test -xtc 'select 123::text = 123::int'
ERROR:  operator does not exist: text = integer
LINE 1: select 123::text = 123::int
                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
psql casting_test -c 'create function pg_catalog.text(int) returns text immutable language sql as $$select textin(int4out($1))$$'
CREATE FUNCTION

psql casting_test -c 'create cast (int as text) with function pg_catalog.text(int) as implicit'
CREATE CAST

psql casting_test -xtc 'select 123::text = 123::int'
 ?column? | t

psql -qc 'drop database if exists casting_test2'
psql -qc 'create database casting_test2'
pg_dump casting_test | psql -q casting_test2
psql casting_test2 -xtc 'select 123::text = 123::int'
 ?column? | t

Yay, it works! Thanks, Tom, for commit 9feefedf9e92066fa6609d1e1e17b4892d81716f. The fix even got back-patched, which means it will appear not only in Postgres version 9.5, but also in versions 9.4.2, 9.3.7, 9.2.11, 9.1.16, and 9.0.20. However, does this mean that pg_dump is logically complete, or are there similar dangers lurking like eels below the water in the source code for pg_dump? You will be happy to learn that I could find no other exceptions inside of src/bin/pg_dump/pg_dump.c. While there are still many places in the code where an object can be excluded, it's all done for valid and expected reasons, such as not dumping a table if the schema it is in is not being dumped as well.


published by noreply@blogger.com (Steph Skardal) on 2015-02-10 16:30:00 in the "company" category

Today, I sat down to read through a few recent End Point blog articles and was impressed at the depth of topics in recent posts (PostgreSQL, Interchange, SysAdmin, Text Editors (Vim), Dancer, AngularJS) from my coworkers. The list continues if I look further back, covering technologies in both front end and back end web development. And this list doesn't even cover the topics I typically write about, such as Ruby on Rails & JavaScript.

While 5 years ago we may have said we predominantly worked with ecommerce clients, our portfolio has evolved to include Liquid Galaxy clients and many non-ecommerce sites as well. With inspiration from reading through these recent posts, I decided to share some updated stats.

Do you remember my post on Wordle from early 2011? Wordle is a free online word cloud generator. I grabbed updated text from 2013 onward from our blog, using the code included in my original post, and generated a new word cloud from End Point blog content:


End Point blog Word cloud from 2013 to present

I removed common words ("one", "like", etc.) from the word cloud that were not removed in the original post. Compared to the original post, it looks like database-related topics (e.g. PostgreSQL) still have strong representation on the blog in terms of word count, as do many other common developer words. Liquid Galaxy now shows up in the word cloud (not surprising), but many of the other technology-specific terms are still present (Spree, Rails, Bucardo).

I also took a look at the top 10 blog posts by page views, as compared to this post:

The page views are not normalized over time, which means older blog posts not only have more page views, but have also had more time to build up traffic from search. Again, this list demonstrates qualitatively the broad range of topics for which our blog is popular, including both very technology-specific posts as well as general development topics. I also suspect our traffic continues to attract long-tail keywords, described more in this post.

Finally, back in October, I visited End Point's Tennessee office and got into a discussion with Jon about how we define our services and/or how our business breaks down into topics. Here's a rough chart of what we came up with at the time:


How do End Point services break down?

Trying to explain the broad range and depth of our services can be challenging. Here are a few additional notes related to the pie chart:

  • Our Liquid Galaxy work spans the topics of Hardware & Hosting, Cloud Systems, and Databases.
  • Our Ecommerce services typically include work in the topics of Backend & Client Side Development, as well as Databases.
  • Our development in mobile applications spans Backend & Client Side Development.

All in all, I'm impressed that we've continued to maintain expertise in long-standing topics such as PostgreSQL and Interchange, but also haven't shied away from learning new technologies such as GIS as related to Liquid Galaxy and JavaScript frameworks.


P.S. If you are interested in generating word statistics via command line, the following will get you the top 20 words given a text file:

tr -c '[:alnum:]' '[\n*]' < some_text_file.txt | sort | uniq -c | sort -nr | head -20

published by noreply@blogger.com (Greg Sabino Mullane) on 2015-02-10 13:00:00 in the "database" category

We recently upgraded a client from Postgres version 8.3 to version 9.4. Yes, that is quite the jump! In the process, I was reminded about the old implicit cast issue. A major change of Postgres 8.3 was the removal of some of the built-in casts, meaning that many applications that worked fine on Postgres 8.2 and earlier started throwing errors. The correct way to fix such things is to adjust the underlying application and its SQL. Sometimes this means a large code change, which is not always possible because of the size and/or complexity of the code, or simply the sheer inability to change it for various other reasons. Thus, another solution is to add some of the casts back in. However, this has its own drawbacks, as seen below.

While this may seem a little academic, given how old 8.3 is, we still see that version in the field. Indeed, we have more than a few clients running versions even older than that! While pg_upgrade is the preferred method for upgrading between major versions (even upgrading from 8.3), its use is not always possible. For the client in this story, in addition to some system catalog corruption, we wanted to move to using data checksums. A logical dump via pg_dump was therefore a better choice.

The implicit casts can be added back in via a two-step approach of adding a support function, and then a new cast that uses that function to bind two data types. The canonical list of "missing" casts can be found at this blog post by Peter Eisentraut. The first rule of adding back in implicit casts is "don't do it, fix your code instead". The second rule is "only add the bare minimum needed to get your application working". The basic format for re-adding the casts is:

create function pg_catalog.text(int) returns text immutable language sql as $$select textin(int4out($1))$$

create cast (int as text) with function pg_catalog.text(int) as implicit

Once we got the pg_dump and import working from version 8.3 to 9.4, some old but familiar errors started popping up that looked like this:

ERROR:  operator does not exist: text = bigint at character 32
HINT: No operator matches the given name and argument type(s). You might need to
add explicit casts.

This was quickly fixed by applying the FUNCTION and CAST from above, but why did we have to apply it twice (the original, and after the migration)? The reason is that pg_dump does *NOT* dump custom casts. Yes, this is a bit surprising as pg_dump is supposed to write out a complete logical dump of the database, but casts are a specific exception. Not all casts are ignored by pg_dump - only if both sides of the cast are built-in data types, and everything is in the pg_catalog namespace. It would be nice if this were fixed someday, such that *any* user-created objects are dumped, regardless of their namespace.

There is a way around this, however, and that is to create the function and cast in another namespace. When this is done, pg_dump *WILL* dump the casts. The drawback is that you must ensure the function and schema are available to everyone. By default, functions are available to everyone, so unless you go crazy with REVOKE commands, you should be fine. The nice thing about pg_catalog is that the schema is not likely to get dropped :). Being able to dump the added casts has another advantage: creating copies of the database via pg_dump for QA or testing will always work.

So, there is a hard choice when creating custom casts (and this applies to all custom casts, not just the ones to fix the 8.3 implicit cast mess). You can either create your casts inside of pg_catalog, which ensures they are available to all users, but cannot be pg_dumped. Thus, you will need to reapply them anytime you make a copy of the database via a pg_dump (including backups!). Or, you can create them in another schema (e.g. "public"), which means that the function must be executable to everyone, but that you can pg_dump them. I really dislike pg_dump breaking its contract, and lean towards the public schema solution when possible.

Here's a demonstration of the problem and each solution. This is using a Postgres 9.4 instance, and a very simple query to illustrate the problem, using the TEXT datatype and the INT datatype. First, let's create a brand new database and demonstrate the issue:

psql -c 'drop database if exists casting_test'
NOTICE:  database "casting_test" does not exist, skipping
DROP DATABASE

psql -c 'create database casting_test'
CREATE DATABASE

psql casting_test -xtc 'select 123::text = 123::int'
ERROR:  operator does not exist: text = integer
LINE 1: select 123::text = 123::int
                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

Now we will fix it by creating a new cast and a supporting function for it. The error disappears. We also confirm that copying the database by using CREATE DATABASE .. TEMPLATE copies our new casts as well:

psql casting_test -c 'create function pg_catalog.text(int) returns text immutable language sql as $$select textin(int4out($1))$$'
CREATE FUNCTION

psql casting_test -c 'create cast (int as text) with function pg_catalog.text(int) as implicit'
CREATE CAST

psql casting_test -xtc 'select 123::text = 123::int'
 ?column? | t

psql -c 'create database clone template casting_test'
CREATE DATABASE

psql clone -xtc 'select 123::text = 123::int'
 ?column? | t

Now let's see how pg_dump fails us:

psql -qc 'drop database if exists casting_test2'
psql -qc 'create database casting_test2'
pg_dump casting_test | psql -q casting_test2
psql casting_test2 -xtc 'select 123::text = 123::int'
ERROR:  operator does not exist: text = integer
LINE 1: select 123::text = 123::int
                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

Now let's try it again, this time by putting things into the public schema:

psql -qc 'drop database if exists casting_test'
psql -qc 'create database casting_test'
psql casting_test -xtc 'select 123::text = 123::int'
ERROR:  operator does not exist: text = integer
LINE 1: select 123::text = 123::int
                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
psql casting_test -c 'create function public.text(int) returns text immutable language sql as $$select textin(int4out($1))$$'
CREATE FUNCTION

psql casting_test -c 'create cast (int as text) with function public.text(int) as implicit'
CREATE CAST

psql casting_test -xtc 'select 123::text = 123::int'
 ?column? | t

psql -qc 'drop database if exists casting_test2'
psql -qc 'create database casting_test2'
pg_dump casting_test | psql -q casting_test2
psql casting_test2 -xtc 'select 123::text = 123::int'
 ?column? | t

So why does it succeed the second time when using the public schema? By creating the function in a "non-pg" namespace, pg_dump will now dump it and the cast that uses it. This rule is set out in the file src/bin/pg_dump/pg_dump.c, with a source code comment stating:

/*
* As per discussion we dump casts if one or more of the underlying
* objects (the conversion function and the data types) are not
* builtin AND if all of the non-builtin objects namespaces are
* included in the dump. Builtin meaning, the namespace name does not
* start with "pg_".
*/

The moral of the story here is to avoid re-adding the implicit casts if at all possible, for it causes a ripple effect of woes. If you do add them, add only the ones you really need, only add them to the databases that need them, and consider using the public schema, not pg_catalog, for the new function. Remember that you can only fix this per database, so any new databases that get created or used by your application will need them applied. As a final blow against using them, the string concatenation operator will probably start giving you new errors if you try to combine any of the data type combinations used in your custom casts!
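
As a hedged illustration of that last warning (the exact failure depends on which casts you added), a concatenation that previously resolved cleanly may now be ambiguous, and needs an explicit cast to settle it:

-- With an implicit int -> text cast installed, the parser may find more
-- than one candidate operator and refuse to choose:
select 'order #' || 42;
ERROR:  operator is not unique: unknown || integer
HINT:  Could not choose a best candidate operator. You might need to add explicit type casts.

-- An explicit cast resolves the ambiguity:
select 'order #' || 42::text;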


published by noreply@blogger.com (Mark Johnson) on 2015-02-09 16:58:00 in the "ecommerce" category

It's important to understand both how loops work in Interchange and the (very) fundamental differences between interpolating Interchange tag language (ITL) and the special loop tags (typically referred to as [PREFIX-*] in the literature). Absent this sometimes arcane knowledge, it is very easy to get stuck with inefficient loops even with relatively small loop sets. I'll discuss both the function of loops and interpolation differences between the tag types while working through a [query] example. While all loop tags--[item-list], [loop], [search-list], and [query]--process similarly, it is to [query] where most complex loops will gravitate over time (to optimize the initiation phase of entering the loop) and where we have the most flexibility for coming up with alternative strategies to mitigate sluggish loop-processing.

Loop Processing

All loop tags are container tags in Interchange, meaning they have an open and a close tag, and in between is the body. Only inside this body is it valid to use [PREFIX-*] tags (the notable exception being [PREFIX-quote] for the sql arg of [query]). This is because the [PREFIX-*] tags are not true ITL. They are tightly coupled with the structure of the underlying rows of data, and they are processed serially by distinct, optimized regular expressions. Outside the context of the row data from a result set, they are meaningless.

Moreover, the body of a loop tag is slurped into a scalar variable (as all bodies of container tags are handled via the ITL parser), and for each row in the loop's record set the contents are acted upon according to the [PREFIX-*] tags defined within the body. The first important distinction to recognize here is that the per-row action on this scalar is limited to the [PREFIX-*] tags alone. The action occurring at loop time ignores any embedded ITL.

At the end of each row's processing, the copy of the body tied to that one row is then concatenated to the results of all previous rows thus processed. For a loop with N rows (assuming no suppression by [if-PREFIX-*] conditionals) that means every instance of ITL originally placed into the loop body is now present N times in the fully assembled body string. Simple example:

[loop list='1 2 3']
[tmp junk][loop-code][/tmp]
[if scratch junk == 2]
I declare [loop-code] to be special!
[else]
Meh. [loop-code] is ordinary.
[/else]
[/if]
[/loop]

Once this result set with N=3 is processed, but before Interchange returns the results, the assembled return looks like the following string:

[tmp junk]1[/tmp]
[if scratch junk == 2]
I declare 1 to be special!
[else]
Meh. 1 is ordinary.
[/else]
[/if]

[tmp junk]2[/tmp]
[if scratch junk == 2]
I declare 2 to be special!
[else]
Meh. 2 is ordinary.
[/else]
[/if]

[tmp junk]3[/tmp]
[if scratch junk == 2]
I declare 3 to be special!
[else]
Meh. 3 is ordinary.
[/else]
[/if]

Some important observations:

  • It doesn't take much ITL to turn a loop body into a monster interpolation process. One must consider the complexity of the ITL in the body multiplied by the number of rows (total, or the "ml" matchlimit value).

  • ITL does nothing to short-circuit action of the [PREFIX-*] tags. Having [PREFIX-param], [PREFIX-calc], etc. inside an ITL [if] means all those loop tags parse regardless of the truth of the if condition.

  • ITL vs. Loop Tags

    ITL maps to routines, both core and user-defined, determined at compile time. They are processed in order of discovery within the string handed to the ::interpolate_html() routine and have varied and complex attributes that must be resolved for each individual tag. Further, for many (if not most) tags, the return value is itself passed through a new call to ::interpolate_html(), acting on all embedded tags, in an action referred to as reparse. There is, relatively speaking, a good deal of overhead in processing through ::interpolate_html(), particularly with reparse potentially spawning off a great many more ::interpolate_html() calls.

    Loop tags, by contrast, map to a pre-compiled set of regular expressions. In contrast to crawling the string and acting upon the tags in the order of discovery, each regex in turn is applied globally to the string. The size of the string is limited to the exact size of the single loop body, and there is no analogue to ITL's reparse. Further, given this processing pattern, a careful observer might have noted that the order of operations can affect your results. Specifically, tags processed earlier cannot depend on tags processed later. E.g., [PREFIX-param] processes ahead of [PREFIX-pos], and so:

    [if-PREFIX-pos 2 eq [PREFIX-param bar]]
    

    will work, but:

    [if-PREFIX-param bar eq [PREFIX-pos 2]]
    

    will not. While the above is a somewhat contrived example, the impact of loop tag processing order can be more easily seen in an example using [PREFIX-next]:

    [PREFIX-next][PREFIX-param baz][/PREFIX-next]
    Code I only want to run when baz is false, like this [PREFIX-exec foo][/PREFIX-exec] call
    

    Because [PREFIX-next] is the absolute last loop tag to run, every other loop tag in the block is run before the next condition is checked. All [PREFIX-next] does is suppress the resulting body from the return, unlike Perl's next, which short-circuits the remaining code in the loop block.
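
    The usual workaround (a hedged sketch of my own, not from the original article, and assuming [else] blocks are honored inside loop conditionals) is to invert the test into an [if-PREFIX-*] block; as the rewrite later in this post shows with [if-sql-param is_development], a false loop-tag conditional suppresses the loop tags inside it before they ever run:

    [if-PREFIX-param baz]
    [else]
    Code I only want to run when baz is false, like this [PREFIX-exec foo][/PREFIX-exec] call
    [/else]
    [/if-PREFIX-param]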

    An Optimization Example

    As long as you're familiar with the idiosyncrasies of [PREFIX-*] tags, you should make every effort to use them instead of ITL because they are substantially lighter weight and faster to process. A classic case that can yield remarkable performance gains is to directly swap an embedded [perl] or [calc] block with an equivalent [PREFIX-calc] block.
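
    As a minimal sketch (building on the earlier [loop list='1 2 3'] example; the arithmetic is purely illustrative, and I'm assuming, as with [PREFIX-param] above, that [PREFIX-code] is substituted before [PREFIX-calc] runs), a per-row computation written as ITL:

    [loop list='1 2 3']
    Double: [calc][loop-code] * 2[/calc]
    [/loop]

    forces a [calc] interpolation for every row on reparse, whereas the loop-tag version is handled by the loop machinery's pre-compiled regexes with no ITL involved:

    [loop list='1 2 3']
    Double: [loop-calc][loop-code] * 2[/loop-calc]
    [/loop]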

    Let's take a typical query written with little consideration given to whether we use loop tags or ITL, not unlike many I've seen where the resource has simply become unusably slow. This code was originally developed to process 50 records per page view, but over time the team using it has requested increases to that count.

    [query
        list=1
        ml=500 [comment]Ouch! That's a big N[/comment]
        sql="
            SELECT *
            FROM transactions
            WHERE status = 'pending'
            ORDER BY order_date DESC
        "
    ]
    Order [sql-param order_number]
    [if global DEVELOPMENT]
        Show [sql-exec stats_crunch][sql-param stats][/sql-exec], only of interest to developers
    [/if]
    Date: [convert-date format="%b %d, %Y at %T"][sql-param order_date][/convert-date]
    [if cgi show_inventory]
    Inv:
    [if cgi show_inventory eq all]
    * Shipped: [either][sql-param is_shipped][or]pending[/either]
    * Count: [inventory type=shipped sku=[sql-param sku]]
    [/if]
    * On Hand: [inventory type=onhand sku=[sql-param sku]]
    * Sold: [inventory type=sold sku=[sql-param sku]]
    * Shipping: [inventory type=shipping sku=[sql-param sku]]
    [/if]
    Order details:
        <a href="[area
                    href=order_view
                    form="
                        order=[sql-param order_number]
                        show_status=[either][cgi show_status][or]active[/either]
                        show_inventory=[cgi show_inventory]
                    "
                ]">View [sql-param order_number]</a>
    [/query]
    

    Considering this block out of context, it doesn't seem all that unreasonable. However, let's look at some of the pieces individually and see what can be done.

  • We use [if] in 3 different circumstances in the block, yet the values they test are static. They don't change on any iteration. (We are excluding the potential of any other ITL present in the block changing their values behind the scenes.)

  • [convert-date] may be convenient, but it is only one of a number of ways to address date formatting. Our database itself almost certainly has date-formatting routines, but one of the benefits of [convert-date] is you could have a mixed format underlying the data and it can make sense out of the date to some degree. So perhaps that's why the developer has used [convert-date] here.

  • Good chance that stats_crunch() is pretty complicated and that's why the developer wrote a catalog or global subroutine to handle it. Since we only want to see it in the development environment, it'd be nice if it only ran when it was needed. Right now, because of ITL happening on reparse, stats_crunch() fires for every row even if we have no intention of using its output.

  • We need that link to view our order, but on reparse it means ::interpolate_html() has to parse 500 [area] tags along with [either] and [cgi] x 500. All of these tags are lightweight individually, but the sheer number of parses is really going to catch up to us here.

  • Our goal here is to replace any ITL we can with an equivalent use of a loop tag or, absent the ability to remove the ITL logically, to wrap that ITL into a subroutine that can itself be called in loop context with [PREFIX-exec]. The first thing I want to address is those [if] and [either] tags, the lowest-hanging fruit:

    [query
        list=1
        ml=500
        sql="
            SELECT *,
                '[if global DEVELOPMENT]1[/if]' AS is_development,
                [sql-quote][cgi show_inventory][/sql-quote] AS show_inventory,
                COALESCE(is_shipped,'pending') AS show_inventory_shipped
            FROM transactions
            WHERE status = 'pending'
            ORDER BY order_date DESC
        "
    ]
    Order [sql-param order_number]
    [if-sql-param is_development]
        Show [sql-exec stats_crunch][sql-param stats][/sql-exec], only of interest to developers
    [/if-sql-param]
    Date: [convert-date format="%b %d, %Y at %T"][sql-param order_date][/convert-date]
    [if-sql-param show_inventory]
    Inv:
    [if-sql-param show_inventory eq all]
    * Shipped: [sql-param show_inventory_shipped]
    * Count: [inventory type=shipped sku=[sql-param sku]]
    [/if-sql-param]
    * On Hand: [inventory type=onhand sku=[sql-param sku]]
    * Sold: [inventory type=sold sku=[sql-param sku]]
    * Shipping: [inventory type=shipping sku=[sql-param sku]]
    [/if-sql-param]
    Order details:
        <a href="[area
                    href=order_view
                    form="
                        order=[sql-param order_number]
                        show_status=[either][cgi show_status][or]active[/either]
                        show_inventory=[cgi show_inventory]
                    "
                ]">View [sql-param order_number]</a>
    [/query]
    

    By moving those evaluations into the SELECT list of the query, we've reduced the number of interpolations needed to arrive at those static values to 1 or, in the case of the [either] tag, to 0, since we've offloaded that calculation entirely to the database. If is_shipped could be something Perl treats as false but that is not null, we would have to adjust our field accordingly, but either way it could still be easily managed as a database calculation. Moreover, by swapping in [if-sql-param is_development] for [if global DEVELOPMENT], we have kept stats_crunch() from running at all in the production environment.

    Next, we'll consider [convert-date]:

    Date: [convert-date format="%b %d, %Y at %T"][sql-param order_date][/convert-date]

    My first attempt would be to address this similarly to the [if] and [either] conditions, and try to render the formatted date from a database function as an aliased field. However, let's assume the underlying structure of the data varies and that's not easily accomplished, and we still want [convert-date]. Luckily, Interchange supports that same tag as a filter, and [PREFIX-filter] is a loop tag:

    Date: [sql-filter convert_date."%b %d, %Y at %T"][sql-param order_date][/sql-filter]

    [PREFIX-filter] is very handy to keep in mind as many transformation tags have a filter wrapper for them. E.g., [currency] -> [PREFIX-filter currency]. And if the one you're looking at doesn't, you can build your own, easily.
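
    If you do need to roll your own, a custom filter can be defined as Interchange code and then used with [PREFIX-filter] like any built-in. The sketch below is an assumption on my part rather than something from the original article (the CodeDef mechanism and the filter's exact calling convention should be checked against your Interchange version's documentation); it only relies on the value arriving as the first argument:

    CodeDef upcase_first Filter
    CodeDef upcase_first Routine <<EOR
    sub {
        my $val = shift;
        # capitalize the first character of the incoming value
        return ucfirst($val);
    }
    EOR

    With that in place, something like [sql-filter upcase_first][sql-param status][/sql-filter] would apply it per row without any ITL.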

    Now to look at that [inventory] tag. The most direct approach assumes that the code inside [inventory] can be run in Safe, which often it can even if [inventory] is global. However, if [inventory] does run-time un-Safe things (such as creating an object) then it may not be possible. In such a case, we would want to create a global sub, like our hypothetical stats_crunch(), and invoke it via [PREFIX-exec]. However, let us assume we can safely (as it were) invoke it via the $Tag object to demonstrate another potent loop option: [PREFIX-sub].

    [if-sql-param show_inventory]
    [sql-sub show_inventory]
        my $arg = shift;
        return $Tag->inventory({ type => $arg, sku => $Row->{sku} });
    [/sql-sub]
    Inv:
    [if-sql-param show_inventory eq all]
    * Shipped: [sql-param show_inventory_shipped]
    * Count: [sql-exec show_inventory]shipped[/sql-exec]
    [/if-sql-param]
    * On Hand: [sql-exec show_inventory]on_hand[/sql-exec]
    * Sold: [sql-exec show_inventory]sold[/sql-exec]
    * Shipping: [sql-exec show_inventory]shipping[/sql-exec]
    [/if-sql-param]
    

    Let's go over what this gives us:

  • [PREFIX-sub] creates an in-line catalog sub that is compiled at the start of processing, before looping actually begins. As such, the [PREFIX-sub] definitions can occur anywhere within the loop body and are then removed from the body to be parsed.

  • The body of the [PREFIX-exec] is passed to the sub as the first argument. We use that here for our static values to the "type" arg. If we also wanted to access [sql-param sku] from the call, we would have to include that in the body and set up a parser to extract it out of the one (and only) arg we can pass in. Instead, we can reference the $Row hash within the sub body just as we can do when using a [PREFIX-calc], with one minor adjustment to our [query] tag--we have to indicate to [query] we are operating on a row-hash basis instead of the default row-array basis. We do that by adding the hashref arg to the list:
    [query
        list=1
        ml=500
        hashref=1
    

  • We still have access to the full functionality of [inventory] but we've removed the impact of having to parse that tag 2000 times (in the worst-case scenario) if left as ITL in the query body. If we run into Safe issues, that same sub body can either be created as a pre-compiled global sub or, if available, we can set our catalog AllowGlobal in which case catalog subs will no longer run under Safe.

  • Finally, all we have left to address is [area] and its args which themselves have ITL. I will leverage [PREFIX-sub] again as an easy way to manage the issue:

    [sql-sub area_order_view]
        my $show_status = $CGI->{show_status} || 'active';
        return $Tag->area({
            href => 'order_view',
            form => "order=$Row->{order_number}\n"
                  . "show_status=$show_status\n"
                  . "show_inventory=$CGI->{show_inventory}",
        });
    [/sql-sub]
    Order details:
        <a href="[sql-exec area_order_view][/sql-exec]">View [sql-param order_number]</a>
    

    By packaging all of [area]'s requirements into the sub body, I can address all of the ITL at once.

    So now, let's put together the entire [query] rewrite to see the final product:

    [query
        list=1
        ml=500
        hashref=1
        sql="
            SELECT *,
                '[if global DEVELOPMENT]1[/if]' AS is_development,
                [sql-quote][cgi show_inventory][/sql-quote] AS show_inventory,
                COALESCE(is_shipped,'pending') AS show_inventory_shipped
            FROM transactions
            WHERE status = 'pending'
            ORDER BY order_date DESC
        "
    ]
    Order [sql-param order_number]
    [if-sql-param is_development]
        Show [sql-exec stats_crunch][sql-param stats][/sql-exec], only of interest to developers
    [/if-sql-param]
    Date: [sql-filter convert_date."%b %d, %Y at %T"][sql-param order_date][/sql-filter]
    [if-sql-param show_inventory]
    [sql-sub show_inventory]
        my $arg = shift;
        return $Tag->inventory({ type => $arg, sku => $Row->{sku} });
    [/sql-sub]
    Inv:
    [if-sql-param show_inventory eq all]
    * Shipped: [sql-param show_inventory_shipped]
    * Count: [sql-exec show_inventory]shipped[/sql-exec]
    [/if-sql-param]
    * On Hand: [sql-exec show_inventory]on_hand[/sql-exec]
    * Sold: [sql-exec show_inventory]sold[/sql-exec]
    * Shipping: [sql-exec show_inventory]shipping[/sql-exec]
    [/if-sql-param]
    [sql-sub area_order_view]
        my $show_status = $CGI->{show_status} || 'active';
        return $Tag->area({
            href => 'order_view',
            form => "order=$Row->{order_number}\n"
                  . "show_status=$show_status\n"
                  . "show_inventory=$CGI->{show_inventory}",
        });
    [/sql-sub]
    Order details:
        <a href="[sql-exec area_order_view][/sql-exec]">View [sql-param order_number]</a>
    [/query]
    

    Voila! Our new query body is functionally identical to the original body, though admittedly a little more complicated to set up. However, the trade-off in efficiency is likely to be substantial.

    I recently worked on a refactor for a client that was overall very similar to the above example, with a desired N value of 250. The code prior to refactoring took ~70s to complete. Once we had completed the refactor using the same tools as I've identified here, we brought down processing time to just under 3s, losing no functionality.

    Time taken optimizing Interchange loops will almost always pay dividends.


    published by noreply@blogger.com (Richard Templet) on 2015-02-07 01:30:00 in the "DevOps" category
    It is becoming more common for developers not to use the operating system's packages for programming languages. Perl, Python, Ruby, and PHP are all releasing new versions faster than the operating systems can keep up (at least without causing compatibility problems). There are now plenty of tools to help with this: for Perl we have Perlbrew and plenv, for Ruby there are rbenv and RVM, for Python there is Virtualenv, and for PHP there is PHP version. These tools are all great for many different reasons, but they all have issues when used with cron jobs. The cron environment is minimal on purpose: it has a very restrictive path, very few environment variables, and other limitations. As far as I know, all of these tools prefer using the env command to pick the right version of the language you are using. That works great while you are logged in, but it tends to fail badly in a cron job. The cron wrapper script is a super simple script that you put before whatever you want to run in your crontab, and it ensures you have the right environment variables set.
    #!/bin/bash -l
    
    exec "$@"
    
    The crontab entry would look something like this:
    34 12 * * * bin/cron-wrapper bin/blog-update.pl
    
    The -l on the bash invocation makes it act like a login shell, so it picks up anything in ~/.bash_profile and has that available to the env command. This means the cron job runs in the same environment that is set up when you run it from the command line, helping to prevent those annoying cases where a job works fine from the command line but breaks in cron. Jon Jensen went into much greater detail on the benefits of using the -l here. Hope this helps!
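
    As a quick sanity check (a hypothetical crontab line of my own, not from the original post), you can run env through the wrapper and compare the output with what you see in a login shell:

    * * * * * bin/cron-wrapper env > /tmp/cron-wrapper-env.txt 2>&1

    If your version manager's shims and paths show up there, your real jobs should see them too.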

    published by noreply@blogger.com (Patrick Lewis) on 2015-02-06 13:00:00 in the "plugin" category

    When I started using Vim I relied on tree-based file browsers like netrw and NerdTree for navigating a project's files within the editor. After discovering and trying the CtrlP plugin for Vim I found that jumping directly to a file based on its path and/or filename could be faster than drilling down through a project's directories one at a time before locating the one containing the file I was looking for.


    After it's invoked (usually by a keyboard shortcut) CtrlP will display a list of files in your current project and will filter that list on the fly based on your text input, matching it against both directory names and file names. Pressing <control-f> with CtrlP open toggles through two other modes: most recently used files, and current buffers. This is useful when you want to narrow down the list of potential matches to only files you have worked with recently or currently have open in other buffers. I use CtrlP's buffer mode to jump between open files so often that I added a custom mapping to invoke it in my .vimrc file:

    map <leader>b :CtrlPBuffer<cr>

    CtrlP has many configuration options that can affect its performance and behavior, and installing additional plugins can provide alternate matcher engines that search through a directory more quickly and return more relevant results than the default matcher.

    Of the alternate matchers I've tried, I've had the best luck with FelixZ's ctrlp-py-matcher. It's easy to install, works on most systems without requiring additional dependencies, and manages to be both faster and more relevant in its results than the built-in CtrlP matcher.
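
    If you go with ctrlp-py-matcher, hooking it up is a single .vimrc setting that tells CtrlP to delegate matching to the plugin's function (this is from the plugin's README as I remember it, so treat the exact value as an assumption and check its documentation):

    let g:ctrlp_match_func = { 'match': 'pymatcher#PyMatch' }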

    CtrlP is well documented in both its README (available on its GitHub project page) and its Vim documentation (available with :help ctrlp within Vim). The documentation covers the different commands and configuration options provided by CtrlP but simply installing the plugin and hitting <control-p> on your keyboard is enough to get you started with a faster way to navigate between files in any codebase.


    published by noreply@blogger.com (Jeff Boes) on 2015-02-05 14:00:00 in the "Dancer" category
    Inserting content into Dancer output files involves using a templating system. One such is Template::Flute. In its simplest possible explanation, it takes a Perl hash, an XML file, and an HTML template, and produces a finished HTML page.
    For a project involving Dancer and Template::Flute, I needed a way to prepare each web page with its own set of JavaScript and CSS files. One way is to construct separate layout files for each different combination of .js and .css, but I figured there had to be a better way.
    Here's what I came up with: I use one layout for all my typical pages, and within the header, I have:
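    (The actual markup didn't survive the blog feed; a minimal sketch of what such placeholder tags look like, with hypothetical paths and class names chosen to match the spec below, would be:)

    <link rel="stylesheet" href="/css/additional.css" class="additional_style" />
    <script src="/js/additional.js" class="additional_script"></script>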
    The trick here is that there are no such files as "additional.css" and "additional.js". Instead, those are placeholders for the actual CSS and JS files I want to link into each HTML file.
    My Perl object has these fields (in addition to the other content):
        $context->{additional_styles}    = [
            { url => '/css/checkout.css' },
            { url => '/css/colorbox.css' },
        ];
        $context->{additional_scripts}   = [
            { url => '/javascripts/sprintf.js' },
            { url => '/javascripts/toCurrency.js' },
            { url => '/javascripts/jquery.colorbox-min.js' },
            { url => '/javascripts/checkout.js' },
        ];
    
    while my XML file looks like this:
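    (Again, the original XML didn't make it into the feed; a plausible reconstruction, assuming Template::Flute's usual <specification>/<list>/<param> elements and names that match the hash keys above, is:)

    <specification>
      <list name="additional_style" iterator="additional_styles">
        <param name="url" target="href"/>
      </list>
      <list name="additional_script" iterator="additional_scripts">
        <param name="url" target="src"/>
      </list>
    </specification>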

    So we have all the elements, but unless you have used all this before, you may not realize how we get the output. (Skip to the punchline if that's not true.)
    The XML file is a connector that tells Template::Flute how to mix the Perl hash into the HTML template. Usually you connect things via class names: the class name in the HTML and the name field in the XML connect, while the iterator field in the XML and the hash key in the Perl hashref do as well. The case of a
    <list>
    means that the hash value must be an arrayref of hashrefs, i.e.,
     {
      "additional_style" => [
        { url => "...", },
        ...,
       ],
     }
    
    Important note: if the hash value is undefined, you'll get a run-time error when you try to expand the HTML template, and if you have an empty arrayref, the result of the expansion is an empty string (which is just what you want).
    And so, through the magic of Template::Flute, what the browser sees is:
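    (The rendered output was likewise lost in the feed; given the hash above, and assuming Template::Flute keeps the placeholder's class attribute, the expanded header would contain one tag per array element, roughly:)

    <link rel="stylesheet" href="/css/checkout.css" class="additional_style" />
    <link rel="stylesheet" href="/css/colorbox.css" class="additional_style" />
    <script src="/javascripts/sprintf.js" class="additional_script"></script>
    <script src="/javascripts/toCurrency.js" class="additional_script"></script>
    <script src="/javascripts/jquery.colorbox-min.js" class="additional_script"></script>
    <script src="/javascripts/checkout.js" class="additional_script"></script>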

    published by noreply@blogger.com (Kamil Ciemniewski) on 2015-02-05 09:57:00 in the "AngularJS" category

    Some time ago, our CTO, Jon Jensen, sent me a link to a very interesting blog article about AngularJS. I have used the AngularJS framework in one of our internal projects and have been (vocally) very pleased with it ever since. It solves many problems of other frameworks and it makes you quite productive as a developer, if you know what you're doing. It's equally true that even the best-marketed technology is no silver bullet in real life. Once you've been through a couple of luckless technology crushes, you tend to stay calm, understanding that in the end there's always some tradeoff. We're trying to do our best at finding a balance between chasing after the newest and coolest, and honoring what's already stable and, above all, safe. Because the author of the blog article decided to point at some elephants in the room, it immediately caught our attention. I must admit that the article resonates with me somewhat, though in some places it also doesn't. While I don't have as much experience with Angular as the article's author, I clearly see him sometimes oversimplifying, overgeneralizing, and being vague. I'd like to address many of the author's points in detail, so I will quote sections of the article. The author says: "

    My verdict is: Angular.js is "good enough" for majority of projects, but it is not good enough for professional web app development. ... When I say "professional web app" I mean the app, which is maintainable in a long run, performant in all reasonably modern browsers, has a smooth UX and is mobile-friendly.
    "

    The first example that meets those requirements is our internal project. Saying that Angular isn't a good fit for the author's definition of "professional web apps" is IMHO a huge overgeneralization. Jon Jensen shared the following thoughts on this with me: "

    It is also worth asking what is meant by "maintainable in the long run", since pretty much any web application will need a significant overhaul within 5 years or so, and heavy browser-based JavaScript apps are more likely to need a major overhaul sooner than that. That's partly because front-end technology and browser competition is moving so quickly, but also because JavaScript frameworks are improving so rapidly. It's impossible to predict whether a given framework will be maintained for 5 years, but even more impossible to say whether you would want to keep using it after that long.
    "

    Those questions make sense to me. The long run may not be so long for modern JavaScript apps. Later the blog writer asks: "

    Are there any use cases where Angular shines?
    • Building form-based "CRUD apps".
    • Throw-away projects (prototypes, small apps).
    • Slow corporate monoliths, when performance does not matter and maintenance costs are not discussed (hm, but have you looked at ExtJS?)
    " This is true, but it is also too limited, IMHO. Counterexamples are to be found all over the Internet. The YouTube application for Sony's PlayStation 3 is only one of them. One can use e.g. https://builtwith.angularjs.org to browse others.

    "
    And what are no-no factors for angular?

    • Teams with varying experience.
    • Projects, which are intended to grow.
    • Lack of highly experienced frontend lead developer, who will look through the code all the time.
    • Project with 5 star performance requirements.
    " I agree with the last one. The other ones sprout from the fact that Angular is so liberal in how the app may be structured. In some cases that's good, in others bad; there's always some tradeoff. I'd compare Angular to Sinatra and Ember to Rails. They are intended for different use cases; one isn't superior to the other without a context.

    "

    Is there any Working Strategy, if you are FORCED to work with angular?

    • Taking angular for fast prototyping is OK, hack it and relax.
    • After the prototype is proved to be the Thing-To-Go, kill the prototype. DO NOT GROW IT!
    • Sit and analyze the design mistakes you've made.
    • Start a fresh new project, preferably with other tech stack.
    • Port the functionality from the prototype to your MVP.
    " Agreed with #1. Maintainability isn't trivial with Angular, it's true. One reason is that with dependency injection there's a possibility that, as the number of modules grows, some of them will depend on each other in a circular way: A -> B -> C -> A

    But that's not inherent to Angular; it's inherent to dependency injection itself, and there are known strategies for dealing with it. Another reason is that Angular is so liberal, and yes, you have to always be alert, making sure the code grows in the right direction. It's also true that many teams who have previously been using other MV{C,P} frameworks are converting to Angular. Why? I gave the answer in the first paragraph: there's no silver bullet. If you want truly orthogonal software, you don't grow it with just great tools but with great people. And sometimes even having a star-level team isn't enough, because of the degree to which business requirements change.

    Then: "

    If you still need to grow your project and maintain it in the future:
    • Accept the fact that you will suffer in the future. The lowered expectations will help you stay happy sometimes.
    • Create a thorough guideline based on the popular things (this, this and that) covering all the use cases and patterns you can imagine.
    • Try to keep things as loosely coupled as possible with your OOD knowledge.
    • Choose either MVC or MVVM, but do not start by mixing approaches.
    • Include "refactoring" iterations in your dev process (good interval - each 3 months).
    • Analyze your usage patterns and use cases periodically.
    • Create a metaframework based on angular, tailored SPECIFICALLY for your project needs and your team experience!
    " Agreed with all of those. I'd agree with them for just about any technology out there.

    Then the author says: "

    Dependency injection lacks some functionality you will need sometime.
    "

    That intrigues me, and I'll look for similar opinions from others that explain in detail why the writer thinks that dependency injection will leave me stuck without needed functionality someday. Then: "

    Directives are overloaded with responsibilities (ever wonder why and when you should use isolated scope? Yeah, that's only the tip of the iceberg).
    "

    I don't really think that's a problem, because a directive only has the amount of responsibility you give it. One can use or abuse any technology, so this point doesn't really resonate with me. Then:

    "

    Modules are not real modules.
    • No CommonJS/AMD.
    • No custom lazy loading.
    • No namespaces.
    You can use modules only to specify the dependency graph and get rid of specifying the correct file order for injecting scripts (which is not a problem anyway if you are using component-based structure and, for example, browserify).
    "

    That's only a half-truth. You can use e.g. RequireJS and have "real modules" with Angular; there's even a good blog article describing how to do it: http://www.sitepoint.com/using-requirejs-angularjs-applications/. If you were to use just Angular-flavored modules, one issue you might run into could be name clashes. But then, unless you want to use a dozen third-party modules you find on GitHub, name clashes aren't a real problem out there in the wild. And if you do want to use those modules, you cannot expect to have a "maintainable" codebase over time anyway, can you?

    "

    $scope is "transparent" and inherited by default. Inheritance is known to be an antipattern in OOD. (Proof?) You MUST know the cases, when it can be useful. Angular forces you to get rid of this inheritance all the time.

    "

    I somewhat agree. Managing scopes is sometimes a pain.

    "

    Bidirectional binding is unpredictable and hard to control (unless you know that you MUST control it).

    "

    That for me falls under the "use or misuse potential" category. I can't see it causing any problems unless you create a huge nest of dependent variables and then want to debug it when it goes wrong (there are cleaner ways to achieve the same results).

    "

    Transparent scope and "referential hell" mean that you CANNOT KNOW what part of the system will be updated when you introduce a change using $scope.$apply(). You have no guarantees. This is a design tradeoff. Do you know that each time you call $scope.$apply() you actually call $rootScope.$apply()? And this call updates all scopes and run all your watches? Moreover, $rootScope.$apply() is called each time when:
    • $timeout handler is invoked (almost all debounce services are broken by design)
    • $http receives a response (yeah, if you have a polling implemented on $http ...)
    • any DOM handler is called (have you throttled your ng-mouseovers? They actually invoke ALL your $watches, and built-in digest phasing does not really help)
    If you know, that some change is localised (like, if you click the button, only the same $scope will be affected), then you MUST use $scope.$digest. But again, you will face nasty "$digest is already in progress" issue...

    "

    This is a huge annoyance. He's right about it. Then:

    "

    Yes, angular is complex and have a terrible learning curve. The worst thing is that you are learning framework for sake of learning framework.

    "

    I'd say quite the contrary is true. When we switched to Angular for our internal app, no one on the team had any experience with it. The team ranged in experience from "not much outside of jQuery" to developers much more experienced with many JavaScript frameworks. Yet the team started producing much more almost right away. I also heard them saying that Angular is much easier than our previous setup, which was Backbone + KnockoutJS.

    Then:

    "

    90% of those modules in the wild are broken by design.
    • They are not really performant
    • They do not scale
    • They misuse some of angular features
    • They are badly designed and forces bad practices
    • But hey, who cares? They do their job anyway, right?
    " This is true. I'd just add that it's not really something inherent only to Angular. If you've been a developer long enough, you can probably recall hundreds of hours of fighting with someone else's code which doesn't work the way you'd expect, or the way it was marketed. The problem is there whether you're trying some Angular modules, other JavaScript libraries and their plugins, or libraries in other languages. You always have to be very careful when pulling third-party code into your application.

    "

    • Those docs suck. They still suck.
    • There are no reference projects.
    • There is no reference project structure.
    • No one share their experience with you through the framework.
    • Yes, those practices can be overwhelming (I am looking at you, Ember). But it's better to have overwhelming practices, than to have none.
    • Some encoded practices are questionable:
    • Blocking the UI while resolving all promises? Really?
    • No subrouting? Hmmm
    " I somewhat agree and somewhat disagree with those. There are reference projects on GitHub. The documentation was my friend really. Not having a rigid standard of doing things is good or bad only within a context.

    I don't want to come across as someone who thinks he has all the answers and knows more than others. This is just my perspective on the points raised in this article. I'd say I mostly agree with the author (with the exceptions stated above). I'd also say that I cannot see any technology that would be entirely free of shortcomings. It's worth noting that the Angular team seems to be aware of them; they're already working on the next version, Angular 2. You can read some more about this project here: http://ng-learn.org/2014/03/AngularJS-2-Status-Preview/. There's nothing perfect in this world, but the best thing we can do is to continuously follow the path of constant, never-ending improvement.

    published by noreply@blogger.com (Emanuele 'Lele' Calo') on 2015-02-02 21:28:00 in the "brussels" category

    It's Sunday evening, the 2015 FOSDEM conference is over, so it's time to give my impressions and opinions.

    As I already said in my day 1 blog post, I really enjoyed the conference, as I usually do at this kind of open source, enthusiast-driven, knowledge-sharing conference.

    That's also where FOSDEM somewhat disappointed me: I experienced it as yet another conference with a series of talks and some nice ideas here and there.

    Whenever I had heard about FOSDEM, it was always described as a developer-centric conference, so my natural conclusion was that there would be a lot of cooperative hands-on coding, some pair programming, huge hackathon rooms, and so on. Unfortunately, from what I could see, this wasn't the case.

    Don't get me wrong: I really loved all the talks I attended and really appreciated all the input I had, which is awesome.

    But as you may know, when expectations fall short you're always left with a little bit of a bitter taste.

    That said, I really loved the talk about KVM observability from Stefan Hajnoczi, which provided a lot of interesting hints and tools I could use straight away when back at work, and the one about NUMA architecture in oVirt from Doron Fediuck, which gave an incredibly useful and clear introduction to NUMA systems.

    I was also really interested to hear about the latest news on the CentOS SIGs project and what we could expect there in the future.

    The nice aspect of this kind of conference is that you get to meet a lot of incredibly talented people who would be difficult to meet otherwise. As an example, I was lucky enough to talk with Luca Gibelli, one of the main initial contributors to ClamAV, who has since moved on to other interesting projects.

    For next year I'd really love to see more space dedicated to encouraging real hands-on programming, such as a help request/proposal zone where FOSDEM visitors could meet, work together, create, and enjoy the marvelous joys of pair programming.

    I really want to thank all the volunteers and speakers who made FOSDEM possible. It was a really great experience and I'll definitely try to be here again in the future!