
published by noreply@blogger.com (Kent K.) on 2015-01-27 14:00:00 in the "cron" category

I recently got the opportunity to pick up development on a Ruby on Rails application that was originally set up to run on AWS using their Elastic Beanstalk deployment tools. One of our first tasks was to move some notification hooks out of the normal workflow into scripts and schedule those batch scripts using cron.

Historically, I've had extremely good luck with Whenever. In my previous endeavors I've used Capistrano, which Whenever integrates with seamlessly. Given how simple it was to integrate Whenever with Capistrano, I anticipated a similar experience with Elastic Beanstalk. The integration was not as seamless as with Capistrano, but I did manage to make it work.
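For reference, Whenever reads its job definitions from config/schedule.rb. A minimal sketch of what such a schedule looks like (the job below is a made-up example, not the project's actual notification script):

# config/schedule.rb
every 1.day, at: '6:00 am' do
  runner "NotificationBatch.deliver_pending"  # runs inside the Rails environment
end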

My first stumbling block was finding documentation on how to do after or post hooks. I managed to find this forum post and this blog post which helped me out a lot. The important detail is that there is a "post" directory to go along with "pre" and "enact", but it's not present by default, so it can be easy to miss.

I used Marcin's delayed_job config as a base. The first thing I had to address was an apparent change in Elastic Beanstalk's configuration structure. Marcin's config has

  . /opt/elasticbeanstalk/support/envvars
but that file doesn't exist on the system I was working on. With a small amount of digging, I found:
  . /opt/elasticbeanstalk/containerfiles/envvars
in one of the other ebextensions. Inspecting that file showed a definition for and exportation of $EB_CONFIG_APP_CURRENT suggesting this is a similar file just stored in a different location now.

Another change that appears to have occurred since Marcin developed his config is that directories will be created automatically if they don't already exist when adding a file in the files section of the config. That allows us to remove the entire commands section to simplify things.

That left me with a config that looked like:

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/99_update_cron.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #! /usr/bin/env bash
      . /opt/elasticbeanstalk/containerfiles/envvars
      su -c "cd $EB_CONFIG_APP_CURRENT; bundle exec whenever --update-cron" - $EB_CONFIG_APP_USER

This command completed successfully, but on staging the cron jobs failed to run. The reason was an environment mismatch: the runner entries inside the cron commands weren't receiving a RAILS_ENV or any other environment directive, so they were defaulting to production and failing when no database was found.

After some grepping I was able to find a definition for RACK_ENV in:

/opt/elasticbeanstalk/containerfiles/envvars.d/sysenv
Making use of it, I came up with this final version:
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/99_update_cron.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #! /usr/bin/env bash
      . /opt/elasticbeanstalk/containerfiles/envvars
      . /opt/elasticbeanstalk/containerfiles/envvars.d/sysenv
      su -c "cd $EB_CONFIG_APP_CURRENT; bundle exec whenever --update-cron --set='environment=$RACK_ENV'" - $EB_CONFIG_APP_USER


published by Eugenia on 2015-01-26 20:22:59 in the "Filmmaking" category
Eugenia Loli-Queru

I was close to getting a Panasonic LX100 for its 4K video, but then I found a deal at Amazon for the Canon S110 for just $180 (1/5th of the LX100’s price). The S110 doesn’t have 4K or full manual control, but it does have the bare minimum needed to shoot nice videos: exposure compensation & lock, manual focus, flat colors, an ND filter, and 1080/24p at a good bitrate. If you half-press the shutter button, it also shows you the shutter speed, so you may be able to lock the exposure at a shutter speed close to 1/48th to achieve an even more filmic look. The camera has a larger sensor and faster lens than most P&S cameras, so for the price it was a steal. I haven’t shot anything interesting with it yet, but so far I like what I’m seeing.


published by noreply@blogger.com (Spencer Christensen) on 2015-01-22 22:35:00 in the "CentOS" category

We use a variety of hosting providers for ourselves and our clients, including Hetzner. They provide good servers for a great price, have decent support, and we've been happy with them for our needs.

Recently I was given the task of building out a new development server for one of our clients, and we wanted it to be set up identically to another one of their servers but with CentOS 7. I placed the order for the hardware with Hetzner and then began the procedure for installing the OS.

Hetzner provides a scripted install process that you can kick off after booting the machine into rescue mode. I followed this process and selected CentOS 7 and proceeded through the whole process without a problem. After rebooting the server and logging in to verify everything, I noticed that the disk space was capped at 2 TB, even though the machine had two 3 TB drives in it (in hardware RAID 1). I looked at the partitions and found the partition table was "msdos". Ah ha!

At this point painful memories of running into this problem before hit me. I reviewed our notes on what we had done last time and felt it was worth a shot, even though this time I was dealing with CentOS 7. I went through the steps up to patching anaconda and then found that anaconda for CentOS 7 is newer and the files are different. I couldn't find any files that cared about the partition table type, so I didn't patch anything.

I then tried to run the CentOS 7 install as-is. This only got me so far because I then ran into trouble with NetworkManager timing out and not starting.

A screenshot of the CentOS 7 installer (anaconda) failing, similar to what I was seeing.

Baffled, I looked into what may have been causing the trouble and discovered that the network was not set up at all and it looked as if no network interfaces existed. WHAT?? At this point I dug through dmesg and found that the network interfaces did indeed exist but udevd had renamed them. Ugh!

Many new Linux distributions are naming network interfaces based on their physical connection to the system: those embedded on the motherboard get named em1, em2, etc. Apparently I missed the memo on this one, as I was still expecting eth0, eth1, etc. And from all indications, so was NetworkManager because it could not find the network interfaces!

Rather than spend more time going down this route, I decided to change gears and look to see if there was any way to patch the Hetzner install scripts to use a GPT partition table with my install instead of msdos. I found and read through the source code for their scripts and soon stumbled on something that just might solve my problem. In the file /root/.oldroot/nfs/install/functions.sh I found mention of a config variable FORCE_GPT. If this is set to "1" then it will try to use a GPT partition table unless it thinks the OS won't like it, and it thinks that CentOS won't like it (no matter the version). But if you set FORCE_GPT to "2" it will use a GPT partition table no matter what. This config setting just needs to be added to the file you edit where you list out your partitions and LVM volumes.

FORCE_GPT 2                                                                                                      

PART /boot ext3 512M                                                                                             
PART lvm   vg0  all                                                                                              
                                                                                                                 
LV  vg0  swap swap   swap  32G                                                                                   
LV  vg0  root  /     ext4 100G                                                                                   
LV  vg0  home  /home ext4 400G                                                                                   

I then ran the installer script and added the secret config option and... Bingo! It worked perfectly! No need to manually patch anything or install manually. And now we have a CentOS 7 server with full 3 TB of disk space usable.

(parted) print                                                            
Model: DELL PERC H710 (scsi)
Disk /dev/sda: 3000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name  Flags
 3      1049kB  2097kB  1049kB                     bios_grub
 1      2097kB  539MB   537MB   ext3
 2      539MB   3000GB  2999GB                     lvm

published by noreply@blogger.com (David Christensen) on 2015-01-22 15:53:00 in the "postgres" category
I'm excited to have my talk "Choosing a Logical Replication System" accepted to PGConf.US! I'll be speaking on Friday, March 27th from 2:00 - 2:50, as part of the Strategy track.

In this talk I will cover a variety of existing Logical Replication systems for PostgreSQL and go over some of the differences between requirements, supported capabilities, and why you might choose one system over another. I'll also cover some of the changes in PostgreSQL 9.4.

Read about the talk here.


published by noreply@blogger.com (Greg Sabino Mullane) on 2015-01-21 22:21:00 in the "chrome" category

A little while ago, I bought a Chromebook as an alternative to my sturdy-but-heavy laptop. So far, it has been great - quick boot up, no fan, long battery life, and light as a feather. Perfect for bringing from room to room, and for getting some work done in a darkened bedroom at night. The one large drawback was a lack of SSH, a tool I use very often. I'll describe how I used one-time passwords to overcome this problem, and made my Chromebook a much more productive tool.

The options for using SSH on Chrome OS are not that good. I downloaded and tried a handful of apps, but each had some significant problems. One flaw shared across all of them was a lack of something like ssh-agent, which will cache your SSH passphrase so that you don't have to type it every time you open a new SSH session. An option was to use a password-less key, or a very short passphrase, but I did not want to make everything less secure. The storage of the SSH private key was an issue as well - the Chromebook has very limited storage options, and relies on putting most things "in the cloud".

What was needed was a way to use SSH in a very insecure environment, while providing as much security as possible. Eureka! A one-time password system is exactly what I needed. Specifically, the wonderful otpw program. Chromebooks have a simple shell (accessed via ctrl-alt-t) that has SSH support. So the solution was to use one-time passwords and not store anything at all on the Chromebook.

Rather than trying to get otpw setup on all the servers I might need to reach, I simply set it up on my main laptop, carefully allowed incoming SSH connections, and now I can ssh from my Chromebook to my laptop. From there, to the world. Best of all, when I ssh in, I can use the already running ssh-agent on the laptop! All it takes is memorizing a single passphrase and securing a sheet of paper (which is far easier to secure than an entire Chromebook :)

Here are some details on how I set things up. On the Chromebook, nothing is needed except to open up a crosh tab with ctrl-alt-t, and run ssh. On the laptop side, the first step is to install the otpw program, and then configure PAM so that it uses it:

$ sudo aptitude install otpw-bin
$ sudo sh -c 'cat >> /etc/pam.d/ssh'
  auth     required  pam_otpw.so
  session  optional  pam_otpw.so

That is the bare minimum, but I also wanted to make sure that only 'local' machines could SSH in. While there are a number of ways to do this, such as iptables or /etc/hosts.allow, I decided the best approach was to configure sshd itself. The "Match" directive instructs that the lines after it only take effect on a positive match. Thus:

$ sudo sh -c 'cat >> /etc/ssh/sshd_config'
AllowUsers nobodyatall
Match Address 192.168.1.0/24,127.0.0.0
AllowUsers greg
$ sudo service ssh restart

The next step is to create the one-time password list. This is done with the otpw-gen program; here is the command I use:

$ otpw-gen -e 30 | lpr
Generating random seed ...

If your paper password list is stolen, the thief should not gain access to your account with this information alone. Therefore, you need to memorize and enter below a prefix password. You will have to enter that each time directly before entering the one-time password (on the same line).

When you log in, a 3-digit password number will be displayed.  It identifies the one-time password on your list that you have to append to the prefix password. If another login to your account is in progress at the same time, several password numbers may be shown and all corresponding passwords have to be appended after the prefix password. Best generate a new password list when you have used up half of the old one.

Enter new prefix password: 
Reenter prefix password: 

Creating '~/.otpw'.
Generating new one-time passwords ...

The otpw-gen command creates a file named .otpw in your home directory, which contains the hash of all the one-time passwords to use. In the example above, the -e controls the entropy of the generated passwords - in other words, how long they are. otpw-gen will not accept an entropy lower than 30, which generates passwords that are five characters long. The default entropy, 48, generates passwords that are eight characters long, which I found a little too long to remember when trying to read from the printout in a dark room. :) Rather than show the list of passwords on the screen, or save them to a local file, the output goes directly to the printer, and otpw-gen does a great job of formatting the page.

Here are some close-ups of what the passwords look like at various entropies:

Sample output with a low entropy of 30:
OTPW list generated 2015-07-12 13:23 on gregsbox

000 GGS%F  056 bTqut  112 f8iJs  168 lQVjk  224 gNG2x  280 -x8ke  336 egm5n
001 urHLf  057 a/Wwh  113 -PEpV  169 9ABpK  225 -K2db  281 babfX  337 feeED
002 vqrX:  058 rZszx  114 r3m8a  170 -UzX3  226 g74RI  282 gusBJ  338 ;Tr4m
003 fa%6G  059 -i4FZ  115 nPEaJ  171 o64FR  227 uBu:h  283 uBo/U  339 ;pYY8
004 -LYZY  060 vWDnw  116 f5Sb+  172 hopr+  228 rWXvb  284 rksPQ  340 ;v6GN
Sample output with the default entropy of 48:
OTPW list generated 2015-15-05 15:53 on gregsbox

000 tcsx qqlb  056 ougp yuzo  112 lxwt oitl  168 giap vqsj  224 vtvk rjc/
001 mfui ukph  057 wbpw aktt  113 kert wozj  169 ihed psyx  225 ducx pze=
002 wwsj hdcr  058 jmwa mguo  114 idtk zrzw  170 ecow fepm  226 ikru hty+
003 aoeb klnz  059 pvie fbfc  115 fmlb sptb  171 ftrd jotb  227 mqns ivq:
004 yclw hyml  060 slvj ezfi  116 djsy ycse  172 butg guzm  228 pfyv ytq%
005 eilj cufp  061 zlma yxxl  117 skyf ieht  173 vbtd rmsy  229 pzyn zlc/
Sample output with a high entropy of 79:
OTPW list generated 2015-07-05 18:74 on gregsbox

000 jeo SqM bQ9Y ato  056 AyT jsc YbU0 rXB  112 Og/ I3O 39nY W/Z
001 AFk W+5 J+2m e1J  057 MXy O9j FjA8 8q;  113 a6A 8R9 /Ofr E4s
002 02+ XPB 8B2S +qT  058 Cl4 6g2 /9Bk KO=  114 HEK vd3 T2TT Rr.
003 Exb jqE iK49 rfX  059 Qhz eU+ J2VG kwQ  115 aJ7 tg1 dJsr vf.
004 Bg1 b;5 p0qI f/m  060 VKz dpa G7;e 7jR  116 kaL OSw dC8e kx.

The final step is to SSH from the Chromebook to the laptop! Hit ctrl-alt-t, and you will get a new tab with a crosh prompt. From there, attempt to ssh to the laptop, and you will see the usual otpw prompt:

$ ssh greg@192.168.1.10
Password 140: 

So you type in the passphrase you entered above when running the otpw-gen command, then pull out your sheet of paper and look up the matching password next to number 140. Voila! I am now connected securely to my more powerful computer, and can SSH from there to anywhere I am used to going to from my laptop. I can even run mutt as if I were at the laptop! A nice workaround for the limitations of the Chromebook.


published by noreply@blogger.com (Jeff Boes) on 2015-01-21 14:00:00 in the "perl" category

Recently I worked on a project using the Perl web application framework Dancer that had multiple paths to order a product:

 /product => /cart => /checkout => /receipt

That's the standard approach. Then there was a "phone order" approach:

 /create_order => /checkout => /receipt

A "phone order" is one taken down (usually by phone), where the user who is logged in is not the same as the user who "owns" the order. Thus, one user is ordering on behalf of another: the order must be recorded as part of the second user's order history, the various shipping and billing information must come from that user's stored information, and even the product pricing has to be calculated as though that customer were doing the ordering rather than the logged-in user.

As a consequence, the phone order page flow actually ended up as:

 get /create_order => post /create_order => /checkout

The submission of the /create_order page was processed in an environment that knew about this "proxy" ordering arrangement, thus could do some particularly special-case processing, and then the idea was to pass off to the /checkout page, which would finalize the order including payment information.

All well and good, but when it came time to implement this, I was faced with a minor inconvenience and a bad choice:

Since /checkout was itself a POSTed page, I needed to reach that page with a set of form parameters in hand. So my original plan was:

 post '/create_order' => sub {
   ... # do my special-case processing, and then:
   forward '/checkout', { param1 => $value1, ... };
 };

While this works, the problem is that "forward" as a Dancer directive doesn't interact with the browser: it just interrupts your path handling of "/create_order" and resumes at "/checkout". So the browser, innocent of these shenanigans, remains on "/create_order". It would be so much cleaner (darn my OCD!) if the browser ended up at "/checkout".

That means you need to redirect the request, though. I.e.,

 post '/create_order' => sub {
   ... # do my special-case processing, and then:
   redirect '/checkout';  # hmm, something's missing here
 };

Hmm, redirect doesn't support a parameter hash. Oh, well, no problem:

   redirect url_for('/checkout', { param1 => $value1, ... });

That gets the job done, but at a price: now instead of a nice, clean URL at my final destination, I get:

   .../checkout?param1=value1&param2=...

So, still not right. Some research and mailing-list inquiries led me to:

Why doesn't HTTP have POST redirect?

Short version: you can't get there from here. Redirects are supposed to be "idempotent", meaning you can repeat them without harm. That's why, when you refresh the page after a form submission, browsers ask for permission to re-submit the form rather than just silently refreshing the page.

So what's the option? Well, I can think of two approaches here:

One: instead of redirecting with parameters, store the parameters in the session:

 post '/create_order' => sub {
   ... # do my special-case processing
   session 'create_order_for_checkout' => { param1 => $value1, ... };
   redirect '/checkout';
 };
 post '/checkout' => sub {
   my $params = (session 'create_order_for_checkout')
     || params();
   ...
 };

Two: do away with the post handler for '/create_order' altogether, and move the processing inside the post handler for '/checkout'. The merits of that depend on how complex the /create_order handler is.
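A rough sketch of that second approach (set_up_proxy_order and process_checkout are hypothetical helpers standing in for the real logic; the phone_order parameter is likewise an assumption):

 post '/checkout' => sub {
   if ( param('phone_order') ) {
     # The special-case proxy-order processing that used to live
     # in post '/create_order' would happen here instead.
     set_up_proxy_order( scalar params() );
   }
   # ...then the normal checkout processing continues as before.
   process_checkout( scalar params() );
 };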

I'm leaning toward the first approach, definitely.


published by noreply@blogger.com (Jeff Boes) on 2015-01-14 18:00:00 in the "https" category

A co-worker innocently pointed me to the documentation for a JS package here:

http://olado.github.io/doT/index.html

Then he couldn't figure out why I could not glean the necessary info to proceed in my task from this page. We went back and forth on IRC until, in understandable frustration, he sent me a screen shot of the page that had the info he was trying to convey.

Surprisingly, that screen shot had more "stuff" on it than the page I was looking at.

I sent him my screen shot, and he was just as puzzled as I was. (For those looking at the doT page, I was losing all of the tabs under "Usage".)

Eventually, since the concept of JS was involved, I peeked at the console log for the page:

Blocked loading mixed active content "http://code.jquery.com/jquery.min.js"

I figured out that "HTTPS Everywhere" was at fault: it was blocking the jQuery library from loading, because the author of the doT page had hard-coded the link as "http":
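The offending markup did not survive the blog's formatting, but judging by the console message it was presumably something like:

<script src="http://code.jquery.com/jquery.min.js"></script>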


So, we have a page that describes a JS plug-in, which un-gracefully fails when JS is disabled or when a JS library is not available. Ugh. #fail.

Always make sure your page does something useful when (not if) the inevitable failure of a resource occurs.


published by noreply@blogger.com (Matt Galvin) on 2015-01-14 14:53:00 in the "ecommerce" category

Hello again all. I like to monitor the orders and exceptions of the Spree sites I work on to ensure everything is working as intended. One morning I noticed an unusual error: "invalid value for Integer(): "09"" in Spree::Checkout/update on a Spree 2.1.x site.

The Issue

Given that this is a Spree-powered e-commerce site, a customer's inability to checkout is quite alarming. In the backtrace I could see that a string of "09" was causing an invalid value for an integer. Why hadn't I seen this on every order in that case?

I went into the browser and completed some test orders. The bug seemed to affect only credit cards with a leading "0" in the expiration month, and then only certain expiration months. I returned to the backtrace and saw this error was occurring with Active Merchant. So, Spree was passing Active Merchant a string while Active Merchant was expecting an integer.

Armed with a clearer understanding of the problem, I did some Googling and came across this post, which attributes the issue to the behavior of sprintf, described below. The topic was also discussed in the Ruby Forum.

Octal Numbers

As per Daniel Martin on the aforementioned post:

  • sprintf("%d",'08') ==> ArgumentError
  • sprintf("%d",'8') ==> "8"
  • sprintf("%d",'08'.to_i) ==> "8"
  • sprintf("%f",'08') ==> "8.000000"

As you can see, sprintf cannot convert '08' or '09' to a decimal. Matthias Reitlinger notes:

"%d tells sprintf to expect an Integer as the corresponding argument. Being given a String instead it tries to convert it by calling Kernel#Integer."

The same post links to the documentation for Kernel#Integer, which tells us that when the argument is a String (and it is, since that is what Spree is sending), a leading "0" is honored as a radix indicator. Again, we know

sprintf("%d",'01') => "1" | sprintf("%d", 01) => "1"
sprintf("%d",'02') => "2" | sprintf("%d", 02) => "2"
sprintf("%d",'03') => "3" | sprintf("%d", 03) => "3"
sprintf("%d",'04') => "4" | sprintf("%d", 04) => "4"
sprintf("%d",'05') => "5" | sprintf("%d", 05) => "5"
sprintf("%d",'06') => "6" | sprintf("%d", 06) => "6"
sprintf("%d",'07') => "7" | sprintf("%d", 07) => "7"
sprintf("%d",'08') => error | sprintf("%d", 08) => error
sprintf("%d",'09') => error | sprintf("%d", 09) => error

By prepending "0" to a number, we are marking it as octal. Wikipedia defines the octal numeral system as:

"The octal numeral system, or oct for short, is the base-8 number system, and uses the digits 0 to 7. Octal numerals can be made from binary numerals by grouping consecutive binary digits into groups of three (starting from the right)."

So, "08" and "09" are not valid octal numbers.
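A quick irb session makes the difference concrete (Kernel#Integer applies radix rules to strings, while String#to_i does not):

  Integer('07')  #=> 7   ("07" is a valid octal number)
  Integer('08')  #=> ArgumentError: invalid value for Integer(): "08"
  '08'.to_i      #=> 8   (to_i simply ignores the leading zero)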

Solution

This is why the checkout error did not occur on every order whose payment expiration month had a leading "0": only August ("08") and September ("09") were affected, because the leading "0" marks the value as octal, and 08 and 09 are not valid octal numbers. So I made Spree send integers (sprintf("%d", 8) #=> "8" and sprintf("%d", 9) #=> "9") so that no leading "0" would be sent and nothing would be interpreted as octal. I created an app/models/spree/credit_card_decorator.rb file with the following contents:

Spree::CreditCard.class_eval do
  def expiry=(expiry)
    if expiry.present?
      self[:month], self[:year] = expiry.delete(' ').split('/')
      self[:year] = "20" + self[:year] if self[:year].length == 2
      self[:year] = self[:year].to_i
      self[:month] = self[:month].to_i
    end
  end
end

After adding this, I tested it in the browser and there were no more checkout errors! I hope you've found this interesting and helpful, thanks for reading!


published by noreply@blogger.com (Marina Lohova) on 2015-01-13 15:00:00 in the "AngularJS" category

To all of you window.onResize aficionados, I dedicate this blog post because today we will be doing a lot of dynamic resizing in JavaScript. All of it will be done completely and effortlessly with my one-page long Angular directive.

Why do I need to attach an expensive onResize handler to my already overloaded page, you ask? The answer is very simple. Our app layout is pixel-perfect: each element has a predefined width and margins. Yet the app needs to look good on all kinds of devices, from a regular PC to a tablet to an iPhone. That's why I created the following Angular directive in /scripts/directives/tsResize.js:

angular.module('angularApp')
.directive('tsResize', function($window) {
 return function(scope, element) {
   var w = angular.element($window);
   scope.getWindowDimensions = function () {
     return {
       'h': $window.innerHeight,
       'w': $window.innerWidth
     };
   };
   scope.$watch(scope.getWindowDimensions, 
              function (newValue, oldValue) {
     scope.windowHeight = newValue.h;
     scope.windowWidth = newValue.w;

     scope.mainContainerStyle = function () {
       if (newValue.w > 890) {
         return {};
       } else {
          var val = newValue.w/890;
         return {
           '-webkit-transform': 'scale(' + val + ')',
           '-o-transform': 'scale(' + val + ')',
           '-ms-transform': 'scale(' + val + ')',
           'transform': 'scale(' + val + ')',
           'transform-origin': 'left -10px',
           '-webkit-transform-origin': 'left -10px'               
         };
       }
     };
     
     scope.topBarStyle = function () {
       if (newValue.w > 890) { 
         return {};
       } else { 
          var val = newValue.w/890;
         return {
           '-webkit-transform': 'scale(' + val + ')',
           '-o-transform': 'scale(' + val + ')',
           '-ms-transform': 'scale(' + val + ')',
           'transform': 'scale(' + val + ')',
           'transform-origin': '0 2px 0',
           '-webkit-transform-origin': '0 2px 0'  
         };
       }
     };
    }, true);

   w.bind('resize', function () {
     scope.$apply();
   });
  }
})

As you can see, all the magic is done with the transform: scale CSS property on two of my main page components: the navigation and the contents container.

The styles are cross-browser.

return {
  '-webkit-transform': 'scale(' + val + ')',
  '-o-transform': 'scale(' + val + ')',
  '-ms-transform': 'scale(' + val + ')',
  'transform': 'scale(' + val + ')'             
}; 

It's important to set transform-origin, or the elements will be weirdly positioned on the page.

return {
  'transform-origin': '0 top',
  '-webkit-transform-origin': '0 top'                
}; 

The style calculations are attached to the changes of window dimensions.

scope.getWindowDimensions = function () {
  return {
    'h': $window.innerHeight,
    'w': $window.innerWidth
  };
};
scope.$watch(scope.getWindowDimensions, 
             function (newValue, oldValue) {
...
});

A few other things: my layout was sliced to a fixed width of 890px, which is why I took 890 as the pivot of my scale ratio formula. You should take the default width of your layout as the base of your calculation.

if (newValue.w > 890) {
  return {};
} else {
  var val = newValue.w/890;
  return {
    '-webkit-transform': 'scale(' + val + ')'
    // ...plus the other vendor-prefixed transform properties shown above
  };
}

With the directive in place it's time to plug it in:
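The markup itself got swallowed by the blog, but the idea is simply an attribute directive plus ng-style bindings on the two scaled components, something like this (the element names and classes here are illustrative):

 <body ts-resize>
   <nav class="top-bar" ng-style="topBarStyle()">...</nav>
   <div class="main-container" ng-style="mainContainerStyle()">...</div>
 </body>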


Be sure to set "display:block" or "display:inline-block" and "position:relative" on all the inner components of the scaled elements that still have their default display. Otherwise they do not obey the scaling and grow far too long, prompting a scrollbar.

It all worked nicely and I was able to enjoy the smoothly resizing layout.


published by noreply@blogger.com (Greg Sabino Mullane) on 2015-01-12 20:07:00 in the "database" category

The popularity of using JSON and JSONB within Postgres has forced a solution to the problem of question mark overload. JSON (as well as hstore) uses the question mark as an operator in its queries, and Perl DBI (esp. DBD::Pg) uses the question mark to indicate a placeholder. Version 3.5.0 of DBD::Pg has solved this by allowing the use of a backslash character before the question mark, to indicate it is NOT a placeholder. We will see some code samples after establishing a little background.

First, what are placeholders? They are special characters within a SQL statement that allow you to defer adding actual values until a later time. This has a number of advantages. First, it completely removes the need to worry about quoting your values. Second, it allows efficient re-use of queries. Third, it reduces network traffic as you do not need to send the entire query each time it is re-run. Fourth, it can allow for seamless translation of data types from Postgres to your client language and back again (for example, DBD::Pg translates easily between Perl arrays and Postgres arrays). There are three styles of placeholders supported by DBD::Pg - question marks, dollar-signs, and colon-names.
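For example, here are the first two styles in action, reusing the pg_class query from below ($dbh is assumed to be an existing DBD::Pg database handle):

# Question-mark placeholder: the value is supplied at execute() time
my $sth = $dbh->prepare('SELECT count(*) FROM pg_class WHERE relpages > ?');
$sth->execute(24);

# The same query using the dollar-sign style:
$sth = $dbh->prepare('SELECT count(*) FROM pg_class WHERE relpages > $1');
$sth->execute(24);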

Next, what are Postgres operators? They are special symbols within a SQL statement that perform some action using the values to their left and right as inputs. It sounds more complicated than it is. Take this query:

SELECT count(*) FROM pg_class WHERE relpages > 24;

In this case, the operator is ">" - the greater than sign. It compares the things on its left (in this case, the value of the relpages column) with the things on its right (in this case, the number 24). The operator will return true or false - in this case, it will return true only if the value on its left is larger than the value on its right. Postgres is extremely extensible, which means it is easy to add all types of new things to it. Adding your own operator is fairly easy. Here's an example that duplicates the greater-than operator, but with a ? symbol:

CREATE OPERATOR ? (procedure=int4gt, leftarg=integer, rightarg=integer);

Now the operator is ready to go. You should be able to run queries like this:

SELECT count(*) FROM pg_class WHERE relpages ? 24;

The list of characters that can make up an operator is fairly small. The documentation has the detailed rules, but the basic list is + - * / < > = ~ ! @ # % ^ & | ` ?. Note that an operator can consist of more than one character, for example, >=

A question mark inside a SQL query can be both a placeholder and an operator, and the driver has no real way to figure out which is which. The first real use of a question mark as an operator was with the geometric operators and then with the hstore module, which allows storing and querying of key/value pairs. It uses a lone question mark to determine if a given value appears as a key in a hstore column. For example, if the goal is to find all rows in which an hstore column contains the value foobar, the SQL would be:

SELECT * FROM mytable WHERE myhstorecol ? 'foobar';

However, if you were to try this via a Perl script using the question-mark placeholder style, DBD::Pg would get confused (and rightly so):

$sth = $dbh->prepare('SELECT * FROM mytable WHERE myhstorecol ? ?');
$sth->execute('foobar');
DBD::Pg::st execute failed: called with 1 bind variables when 2 are needed

Trying to use another placeholder style still does not work, as DBD::Pg still picks it up as a possible placeholder

$sth = $dbh->prepare('SELECT * FROM mytable WHERE myhstorecol ? $1');
$sth->execute('foobar');
Cannot mix placeholder styles "?" and "$1"

A few years ago, a solution was developed: by setting the database handle attribute "pg_placeholder_dollaronly" to true, DBD::Pg will ignore the question mark and only treat dollar-sign numbers as placeholders:

$dbh->{pg_placeholder_dollaronly} = 1;
$sth = $dbh->prepare('SELECT * FROM mytable WHERE myhstorecol ? $1');
$sth->execute('foobar');
## No error!

Then came JSON and JSONB. Just like hstore, they have three operators with question marks in them: ?, ?& and ?| - all of which will prevent the use of question-mark placeholders. However, some frameworks and supporting modules (e.g. SQL::Abstract and DBIx::Class) only support the question mark style of placeholder! Hence, another solution was needed. After some discussion on the dbi-users list, it was agreed that a backslash before a placeholder character would allow that character to be "escaped" and sent as-is to the database (minus the backslash). Thus, as of version 3.5.0 of DBD::Pg, the above query can be written as:

use DBD::Pg 3.5.0;
$SQL = "SELECT * FROM mytable WHERE hstorecol \? ?");
$sth = $dbh->prepare($SQL);
$sth->execute('foobar'); 
# No error!
$SQL = "SELECT * FROM mytable2 WHERE jsoncol \? ?");
$sth = $dbh->prepare($SQL);
$sth->execute('foobar');
# Still no error!

So, a fairly elegant solution. The only caveat is to beware of single versus double quotes: inside a double-quoted Perl string you need two backslashes (\\?), while a single-quoted string needs only one. I recommend you always use double quotes and get in the habit of consistently using double backslashes. Not only will you never have to worry about single-vs-double, but it adds a nice little visual garnish to help that important backslash trick stand out a little more.

Much thanks to Tim Bunce for reporting this issue, herding it through dbi-users, and helping write the final DBD::Pg solution and code!


published by noreply@blogger.com (Selvakumar Arumugam) on 2015-01-12 15:30:00 in the "AngularJS" category

This is the second part of an article about the Open Source India 2014 conference, held in Bengaluru, India. The first part is available here. The second day of the conference started with the same level of excitement. I planned to attend talks covering the Web, Big Data, log monitoring, and Docker.

Web Personalisation

Jacob Singh started the first talk session with a wonderful presentation, including real-world cases, which explained the importance of personalisation on the web. It extended to content personalisation for users and A/B testing (comparing two versions of a webpage to see which one performs better). The demo used the Acquia Lift personalisation module for the Drupal CMS, which is developed by his team.

MEAN Stack

Sateesh Kavuri of Yodlee spoke about the MEAN stack, a web development stack equivalent to the popular LAMP stack. MEAN offers flexible support for both web and mobile applications. He explained the architecture of the MEAN stack.

He also provided an overview of each component involved in MEAN Stack.

MongoDB - NoSQL database with dynamic schema, in-built aggregation, mapreduce, JSON style document, auto-sharding, extensive query mechanism and high availability.

ExpressJS - A node.js framework to provide features to web and mobile applications.

AngularJS - seamless bi-directional model with extensive features like services and directives.

Node.js - A server-side JavaScript framework with event-based programming and a single-threaded model (non-blocking I/O with the help of a request queue).

Sails.js - MEAN Stack provisioner to develop applications quickly.

Finally he demonstrated a MEAN Stack demo application provisioned with help of Sails.js.

Moving fast with high performance Hack and PHP

Dushyant Min spoke about the way Facebook optimised its PHP code base to deliver better performance as it handled massive user growth. Earlier, Facebook used the HipHop for PHP compiler (HPHPc), or HPHPi in developer mode, to convert PHP code into a C++ binary which was then executed to produce the response. Later, Facebook developed a new compilation engine called the HipHop Virtual Machine (HHVM), which uses a Just-In-Time (JIT) compilation approach and converts the code to HipHop ByteCode (HHBC). Both Facebook's production and development environments run on HHVM.
Facebook also created a new language called Hack, which is very similar to PHP but adds static typing and many other new features. The main reason for Hack is to get the fastest possible development cycle for adding new features and releasing frequent versions. Hack also runs on the HHVM engine.

The HHVM engine supports both PHP and Hack, and it provides better performance compared to the Zend engine. So the Zend engine can be replaced with HHVM in existing PHP applications, without any issues, to get much better performance.

PHP code can also be migrated to Hack by changing the <?php tag to <?hh, and there are converters (hackificator) available to help with code migration. Both PHP and Hack provide almost the same performance on the HHVM engine, but Hack has some additional developer-focussed features.

Application Monitoring and Log Management

Abhishek Dwivedi spoke about a stack for processing logs that arrive in various formats, with myriad timestamps and no context. He explained a set of tools to process, store, and visualize the logs in an elegant way.

ELK Stack = Elasticsearch, LogStash, Kibana. The components of the ELK stack are:


Elasticsearch - Open source full text search and analytics engine

LogStash - Open source tool for managing events and logs, processing them through a series of steps

Kibana - Works seamlessly with Elasticsearch and provides an elegant user interface with various types of graphs

Apache Spark

Prajod and Namitha presented an overview of Apache Spark, a real-time data processing system that can work on top of the Hadoop Distributed File System (HDFS). Apache Spark performs 100x faster in memory and 10x faster on disk compared to Hadoop. It suits both streaming and interactive Big Data processing.

Apache Spark has certain features in processing the data to deliver the promising performance:

  • Multistep Directed Acyclic Graph
  • Cached Intermediate Data
  • Resilient Distributed Data
  • Spark Streaming - Adjust batch time to get the near real time data process
  • Implementation of Lambda architecture
  • Graphx and Mlib libraries play an important role

Online Data Processing in Twitter

Lohit Vijayarenu from Twitter spoke about the technologies used at Twitter and their contributions to Open Source. Also he explained the higher level architecture and technologies used in the Twitter microblogging social media platform.

The Twitter front end is the main data input for the system. The Facebook-developed Scribe log servers gather data from the Twitter front-end application and transfer it to both batch and real-time Big Data processing systems. Storm is a real-time data processing system which takes care of events as they happen on the site. Hadoop is a batch processing system which runs over historical data and generates result data for analysis. Several high-level abstraction tools like Pig are used to write the MapReduce jobs. Along with these frameworks and tools in the high-level architecture, there are plenty of other Open Source tools used at Twitter. Lohit also emphasized that in addition to using Open Source tools, Twitter contributes back to Open Source.

Docker

Neependra Khare from Red Hat gave a talk and demo on Docker, which was a very interactive session. The gist of Docker is to build, ship, and run any application anywhere. It provides good performance and resource utilization compared to the traditional VM model. It uses the Linux kernel feature called containerization. Container storage is ephemeral, so important data should be stored in persistent external storage volumes. Slides can be found here.


published by noreply@blogger.com (Brian Gadoury) on 2015-01-12 11:00:00 in the "BigCouch" category

As you may have guessed from my perfect tan and rugged good looks, I am Phunk, your river guide. In this multi-part series, I will guide us through an exploration of Elasticsearch, its CouchDB/BigCouch River plugin, its source, the CouchDB/BigCouch document store, and the surrounding flora and fauna that are the Ruby on Rails based tools I created to help the DPLA project manage this ecosystem.

Before we get our feet wet, let's go through a quick safety briefing to discuss the terms I'll be using as your guide on this trip. Elasticsearch: A schema-less, JSON-based, distributed RESTful search engine. The River: An Elasticsearch plugin that automatically indexes changes in your upstream (heh) document store, in real-time. BigCouch: The fault-tolerant, clustered flavor of the stand-alone CouchDB document repository. DPLA: The Digital Public Library of America open source project for which all this work was done.

Let's put on our flotation devices, don our metaphor helmets and cast off.


In an Elasticsearch + River + BigCouch architecture, all things flow from the BigCouch. For the DPLA project, we wanted to manage (create, update and delete) documents in our BigCouch document repository and have those changes automagically reflected in our Elasticsearch index. Luckily, BigCouch publishes a real-time stream (heh) of updates to its documents via its cleverly named "_changes" feed. Each change in that feed is published as a stand-alone JSON document. We'll look at that feed in more detail in a bit.

The River bridges (heh) the gap between BigCouch's _changes feed and Elasticsearch index. The plugin runs inside Elasticsearch, and makes a persistent TCP connection to BigCouch's _changes endpoint. When a new change is published to that endpoint, the River passes the relevant portions of that JSON up to Elasticsearch, which then makes the appropriate change to its index. Let's look at a simple timeline of what the River would see from the _changes feed during the creation of a new document in BigCouch, and then an update to that document:

A document is created in BigCouch, the _changes feed emits:

{
  "seq":1,
  "id":"test1",
  "changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}],
  "doc":{
    "_id":"test1",
    "_rev":"1-967a00dff5e02add41819138abb3284d",
    "my_field":"value1"
  }
}

That same document is updated in BigCouch, the _changes feed emits:

{
  "seq":2,
  "id":"test1",
  "changes":[{"rev":"2-80647a2a9498f5c124b1b3cc1d6c6360"}],
  "doc":{
    "_id":"test1",
    "_rev":"2-80647a2a9498f5c124b1b3cc1d6c6360",
    "my_field":"value2"
  }
}

It's tough to tell from this contrived example document, but the _changes feed actually includes the entire source document JSON for creates and updates. (I'll talk more about that in part 2.) From the above JSON examples, the River would pass the inner-most document containing the _id, _rev and my_field data up to Elasticsearch. Elasticsearch uses that JSON to update the corresponding document (keyed by _id) in its search index and voila, the document you updated in BigCouch is now updated in your Elasticsearch search index in real-time.
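For reference, wiring the River up to a BigCouch/CouchDB database is done by PUTting a small JSON document into Elasticsearch's _river index. A rough sketch, assuming the stock elasticsearch-river-couchdb plugin and default ports (the host, database and index names here are placeholders, not DPLA's actual settings):

curl -XPUT 'http://localhost:9200/_river/my_db/_meta' -d '{
  "type"    : "couchdb",
  "couchdb" : { "host" : "localhost", "port" : 5984, "db" : "my_db" },
  "index"   : { "index" : "my_db", "type" : "my_db" }
}'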

We have now gotten our feet wet with how a document flows from one end to the other in this architecture. In part 2, we'll dive deeper into the DevOps-heavy care, feeding, monitoring and testing of the River. We'll also look at some slick River tricks that can transform your documents before Elasticsearch gets them, and any other silly River puns I can come up with. I'll also be reading the entire thing in my best David Attenborough impression and posting it on SoundCloud.


published by noreply@blogger.com (Greg Sabino Mullane) on 2015-01-08 03:37:00 in the "database" category

How can you tell if your database connection is still valid? One way, when using Perl, is to use the ping() method. Besides backslash-escaped placeholders, a revamped ping() method is the major change in the recently released version 3.5.0 of DBD::Pg, the Perl/DBI interface to Postgres. Before 3.5.0, there was a chance of false positives when using this method. In particular, if you were inside of a transaction, DBD::Pg did not actually attempt to contact the Postgres backend. This was definitely an oversight, and DBD::Pg now does the right thing.
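A minimal usage sketch (the connection values are placeholders): before reusing a long-lived handle, check ping() and reconnect if needed.

use DBI;
my $dsn = 'dbi:Pg:dbname=test';
my $dbh = DBI->connect($dsn, 'greg', 'hushhush', {AutoCommit=>0, RaiseError=>1});

## ...much later, before reusing the handle...
if (! $dbh->ping) {
    ## The backend (or the network) went away; reconnect rather than let the next query die
    $dbh = DBI->connect($dsn, 'greg', 'hushhush', {AutoCommit=>0, RaiseError=>1});
}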

Detecting a dead backend is a little trickier than it sounds. While libpq stores some state information for us, the only way to be sure is to issue a command to the backend. Additionally, we check the value of PQstatus in case libpq has detected a problem. Realistically, it would be far better if the Postgres protocol supported some sort of ping itself, just a simple answer/response without doing anything, but there is nothing like that yet. Fortunately, the command that is issued, /* DBD::Pg ping test, v3.5.0 */, is very lightweight.

One small side effect is that the ping() method (and its stronger cousin, the pg_ping() method) will both cancel any COPY that happens to be in progress. Really, you should not be doing that anyway! :) Calling the next copy command, either pg_getline() or pg_putline(), will tell you if the connection is valid anyway. Since the copy system uses a completely different backend path, this side effect is unavoidable.

Even this small change may cause problems for applications that relied on the previous false-positive behavior. Leaving ping() as a basic no-op, however, was not a good idea, so check that your application is using ping() sanely. For most applications, simple exception handling negates the need to use ping() in the first place.


published by noreply@blogger.com (Greg Sabino Mullane) on 2014-12-22 19:07:00 in the "Bucardo" category
Bucardo is one of the trigger-based replication systems for Postgres (others include Slony and Londiste). All of these not only use triggers to gather information on what has changed, but they also disable triggers when copying things to remote databases. They do this to ensure that only the data itself gets copied, in as fast a manner as possible. This also has the effect of disabling foreign keys, which Postgres implements by use of triggers on the underlying tables. There are times, however, when you need a trigger on a target to fire (such as for data masking). Here are four approaches to working around the disabling of triggers. The first two solutions will work with any replication system, but the third and fourth are specific to Bucardo.

First, let's understand how the triggers get disabled. A long time ago (Postgres 8.2 and older), triggers had to be disabled by direct changes to the system catalogs. Luckily, those days are over, and now this is done by issuing this command before copying any data:

SET session_replication_role = 'replica';

This prevents all normal triggers and rules from being activated. There are times, however, when you want certain triggers (or their effects) to execute during replication.

Let's use a simple hypothetical to illustrate all of these solutions. We will start with the Postgres built-in pgbench utility. The initialize option (-i) can be used to create and populate some tables:

$ createdb btest1
$ pgbench -i btest1
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.16 s, remaining 0.00 s).
vacuum...
set primary keys...
done.

We want to replicate all four of the tables pgbench just created. Bucardo requires that a table have a primary key or a unique index to be replicated, so we will need to make an immediate adjustment to the pgbench_history table:

$ psql btest1 -c 'ALTER TABLE pgbench_history ADD hid SERIAL PRIMARY KEY'
ALTER TABLE

Now to make things a little more interesting. Let's add a new column to the pgbench_accounts table named "phone", which will hold the account owner's phone number. As this is confidential information, we do not want it to be available - except on the source database! For this example, database btest1 will be the source, and database btest2 will be the target.

$ psql btest1 -c 'ALTER TABLE pgbench_accounts ADD phone TEXT'
ALTER TABLE
$ createdb btest2 --template=btest1

To prevent the phone number from being revealed to anyone querying btest2, a trigger and supporting function is used to change the phone number to always display the word 'private'. Here is what they look like.

btest2=# CREATE OR REPLACE FUNCTION elide_phone()
  RETURNS TRIGGER
  LANGUAGE plpgsql
  AS $bc$
BEGIN
  NEW.phone = 'private';
  RETURN NEW;
END;
$bc$;
CREATE FUNCTION

btest2=# CREATE TRIGGER elide_phone
  BEFORE INSERT OR UPDATE
  ON pgbench_accounts
  FOR EACH ROW
  EXECUTE PROCEDURE elide_phone();
CREATE TRIGGER

Now that everything is setup, we can install Bucardo and teach it how to replicate those tables:

$ bucardo install 
This will install the bucardo database into ...
...
Installation is now complete.

$ bucardo add db A,B dbname=btest1,btest2
Added databases "A","B"

$ bucardo add sync pgb dbs=A,B tables=all
Added sync "pgb"
Created a new relgroup named "pgb"
Created a new dbgroup named "pgb"
  Added table "public.pgbench_accounts"
  Added table "public.pgbench_branches"
  Added table "public.pgbench_history"
  Added table "public.pgbench_tellers"

$ bucardo start

A demonstration of the new trigger is now in order. On the database btest2, we will update a few rows and attempt to set the phone number. However, our new trigger will overwrite our changes:

$ psql btest2 -c "update pgbench_accounts set abalance=123, phone='867-5309' where aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone  
-----+----------+---------
   1 |      123 | private
   2 |      123 | private
   3 |      123 | private

So, all is as we expected: any changes made to this table have the phone number changed. Let's see what happens when the changes are done via Bucardo replication. Note that we are updating btest1 but querying btest2:

$ psql btest1 -c "update pgbench_accounts set abalance=99, phone='867-5309' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       99 | 867-5309
   2 |       99 | 867-5309
   3 |       99 | 867-5309

As you can see, our privacy safeguard is gone, as Bucardo disables the trigger on btest2 before making the changes. So what can we do? There are four solutions: set the trigger as ALWAYS, set the trigger as REPLICA, use Bucardo's customcode feature, or use Bucardo's customcols feature.

Solution one: ALWAYS trigger

The easiest way is to simply mark the trigger as ALWAYS, which means that it will always fire, regardless of what session_replication_role is set to. This is the best solution for most problems of this sort. Changing the trigger requires an ALTER TABLE command. Once done, psql will show you the new state of the trigger as well:

btest2=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers 
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       | 
 abalance | integer       | 
 filler   | character(84) | 
 phone    | text          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Triggers:
    elide_phone BEFORE INSERT OR UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE elide_phone()

btest2=# ALTER TABLE pgbench_accounts ENABLE ALWAYS TRIGGER elide_phone;
ALTER TABLE

btest2=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers 
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       | 
 abalance | integer       | 
 filler   | character(84) | 
 phone    | text          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Triggers firing always:
    elide_phone BEFORE INSERT OR UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE elide_phone()

That is some ugly syntax for changing the triggers, eh? (To restore a trigger to its default state, you would simply leave out the ALWAYS clause, so it becomes ALTER TABLE pgbench_accounts ENABLE TRIGGER elide_phone). Time to verify that the ALWAYS trigger fires even when Bucardo is updating the table:

$ psql btest1 -c "update pgbench_accounts set abalance=11, phone='555-2368' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       11 | private
   2 |       11 | private
   3 |       11 | private

Solution two: REPLICA trigger

Trigger-based replication solutions, you may recall from above, issue this command: SET session_replication_role = 'replica'. What this means is that all rules and triggers that are not of type replica are skipped (with the exception of always triggers of course). Thus, another solution is to set the triggers you want to fire to be of type "replica". Once you do this, however, the triggers will NOT fire in ordinary use - so be careful. Let's see it in action:

btest2=# ALTER TABLE pgbench_accounts ENABLE REPLICA TRIGGER elide_phone;
ALTER TABLE

btest2=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers 
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       | 
 abalance | integer       | 
 filler   | character(84) | 
 phone    | text          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Triggers firing on replica only:
    elide_phone BEFORE INSERT OR UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE elide_phone()

As before, we can test it out and verify the trigger is firing:

$ psql btest1 -c "update pgbench_accounts set abalance=22, phone='664-7665' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       22 | private
   2 |       22 | private
   3 |       22 | private

Solution three: Bucardo customcode

Bucardo supports a number of hooks into the replication process. These are called "customcodes" and consist of Perl code that is invoked by Bucardo. To solve the problem at hand, we will create some code for the "code_before_trigger_enable" hook - in other words, right after the actual data copying is performed. To create the customcode, we write the actual code to a text file, then do this:

$ bucardo add code nophone whenrun=before_trigger_enable sync=pgb src_code=./nophone.pl

This creates a new customcode named "nophone" that contains the code inside the local file "nophone.pl". It runs after the replication, but before the triggers are re-enabled. It is associated with the sync named "pgb". The content of the file looks like this:

my $info = shift;

return if ! exists $info->{rows};

my $schema = 'public';
my $table = 'pgbench_accounts';
my $rows = $info->{rows};
if (exists $rows->{$schema} and exists $rows->{$schema}{$table}) {
  my $dbh = $info->{dbh}{B};
  my $SQL = "UPDATE $schema.$table SET phone=? "
    . "WHERE aid = ? AND phone <> ?";
  my $sth = $dbh->prepare($SQL);
  my $string = 'private';
  for my $pk (keys %{ $rows->{$schema}{$table} }) {
    $sth->execute($string, $pk, $string);
  }
}
return;

Every customcode is passed a hashref of information from Bucardo. One of the things passed in is the list of changed rows. At the top, we see that we exit right away (via return, as the customcodes become Perl subroutines) if there are no rows this round. Then we check that something has changed for the pgbench_accounts table. We grab the database handle, also passed to the subroutine. Note that this is actually a DBIx::Safe handle, not a direct DBI handle. The difference is that certain operations, such as commit, are not allowed.

Once we have the handle, we walk through all the rows that have changed, and set the phone to something safe. The above code is a good approach, but we can make the UPDATE much smarter because we are using a modern Postgres which supports ANY, and a modern DBD::Pg that supports passing Perl arrays in and out. Once we combine those two, we can move the execute() out of the loop into a single call like so:

...
if (exists $rows->{$schema} and exists $rows->{$schema}{$table}) {
  my $dbh = $info->{dbh}{B};
  my $SQL = "UPDATE $schema.$table SET phone=?"
    . "WHERE aid = ANY(?) AND phone <> ?";
  my $sth = $dbh->prepare($SQL);
  my $string = 'private';
  $sth->execute($string, [ keys %{ $rows->{$schema}{$table} } ], $string);
}

Note that this solution requires Bucardo version 5.3.0 or better. Let's verify it:

$ psql btest1 -c "update pgbench_accounts set abalance=33, phone='588-2300' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       33 | private
   2 |       33 | private
   3 |       33 | private

Solution four: Bucardo customcols

The final way to keep the information in that column masked is to use Bucardo's 'customcols' feature. This allows rewriting of the command that grabs rows from the source databases. Bucardo uses COPY to grab rows from a source, DELETE to remove the rows if they exist on the target, and another COPY to add the rows to the target tables. Postgres supports adding a SELECT clause to a COPY command, as we will see below. To hide the values of the phone column using the customcols feature, we simply do:

$ bucardo add customcols public.pgbench_accounts "select aid,bid,abalance,filler,'private' as phone" db=B sync=pgb
New columns for public.pgbench_accounts: "select aid,bid,abalance,filler,'private' as phone" (for database B) (for sync pgb)

The list of columns must be the same as in the original table, but we can modify things! So rather than Bucardo doing this:

COPY (SELECT * FROM public.pgbench_accounts WHERE aid IN (1,2,3)) TO STDOUT

Bucardo will instead do this thanks to our customcols:

COPY (SELECT aid,bid,abalance,filler,'private' as phone FROM public.pgbench_accounts WHERE aid IN (1,2,3)) TO STDOUT

Let's verify it:

$ psql btest1 -c "update pgbench_accounts set abalance=44, phone='736-5000' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       44 | private
   2 |       44 | private
   3 |       44 | private

Those are the four approaches to firing (or emulating) triggers when using replication. Which one you choose depends on what exactly your trigger does, but overall, the best solution is probably the 'trigger ALWAYS', followed by 'Bucardo customcols'. If you have another solution, or some problem that is not covered by the above, please let me know in the comments.


published by noreply@blogger.com (Brian Gadoury) on 2014-12-05 21:43:00 in the "AngularJS" category

Seeing the proposed line-up for the 2014 hack.summit() virtual conference was the grown-up equivalent of seeing the line-up for some of the first Lollapalooza events. It was definitely an "All those people I want to see, and all in one place? *head asplode*" moment.

So, what is this conference with the incredibly nerdy name? In short, it's a selection of industry leading speakers presenting all on-line and streamed live. The "registration fee" was actually a choice between mentioning the conference on a few social media platforms, or making a donation to one of a number of programming non-profits. Seeing as I don't tweet, I made a donation, then signed in (using OAuth) via my Google+ account. It was a delightfully frictionless process.

The hack.summit() conference ran December 1st through December 4th, but I was only able to "attend" the last two days. Luckily for me, all of the live-streamed presentations are also available afterwards on the hacksummit site. They feel a little hidden away in the small menu in the upper left corner, but they're all there, available as YouTube videos.

So, why was hack.summit() worth your time? It's got an amazing collection of very accomplished developers, thought leaders and experienced big cheeses of some companies that do some pretty impressive work. During the live event, the Crowdcast platform provided a great delivery mechanism for the streaming videos, as well as admin-created polls, a light-weight chat feature, and audience-voted questions for the presenters. Hack.summit() founder Ed Roman did a great job MC-ing the entire event, too. (And to whoever figured out how to game the voting system at a conference named hack.summit(), well played, you rogue.)

In closing, I strongly recommend you do a few things: Go sign up right now to gain access to the presentation videos. Commit some time (make a deal with yourself, get approval to do a group viewing at work, whatever) to watch as many presentations as you can. Lastly, set a calendar reminder to keep an eye out for the hack.summit() 2015 that will hopefully happen.