
published by noreply@blogger.com (Greg Sabino Mullane) on 2016-02-09 12:00:00 in the "apache" category

tl;dr - don't use a high bcrypt cost with HTTP basic auth!

Recently we had a client approach us with reports of a "slow" wiki experience. This was for a MediaWiki we recently installed for them; there were no fancy extensions, and the hardware, the OS, and the Apache web server were solid, perfectly normal choices. I was tasked to dive in and solve this issue.

The first step in any troubleshooting is to verify and duplicate the problem. While the wiki did feel a bit sluggish, it was not as bad as the reports we were getting of it taking over 15 seconds to view a page. A side-by-side comparison with a similar wiki seemed a good place to start. I called up the main wiki page on both the client wiki and End Point's internal wiki. Both were running the latest version of MediaWiki, had the same type of servers (located a similar distance from me), were using the same version of Apache, and had roughly the same server load. While both wikis' pages had roughly the same amount of content, the client one loaded noticeably slower. It took less than a second for the End Point wiki, and around ten seconds for the client one!

The first suspect was MediaWiki itself. Perhaps something was misconfigured there, or some extension was slowing everything down? MediaWiki has good debugging tools. Inside both wikis' LocalSettings.php files I turned on debugging temporarily with:

$wgDebugLogFile         = '/tmp/mediawiki.debug';
$wgDebugDBTransactions  = true;
$wgDebugDumpSql         = true;
$wgDebugTimestamps      = true;

I reloaded the page, then commented out the $wgDebugLogFile line to stop the file from growing large (the debug output can be quite verbose!). Here are some snippets from the generated log file:

0.9151   4.2M  Start request GET /wiki/Main_Page
...
[caches] main: SqlBagOStuff, message: SqlBagOStuff, parser: SqlBagOStuff
[caches] LocalisationCache: using store LCStoreDB
0.9266   9.2M  Implicit transaction open enabled.
0.9279   9.2M  Query wikidb (1) (slave): SET /* DatabasePostgres::open  */ client_encoding='UTF8'
0.9282   9.2M  Resource id #127: Transaction state changed from IDLE -> ACTIVE
0.9268   9.2M  Query wikidb (2) (slave): SET /* DatabasePostgres::open  */ datestyle = 'ISO, YMD'
...
0.9587   9.2M  Query wikidb (11) (slave): SELECT /* LCStoreDB::get  */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'deps'  LIMIT 1
0.9573   9.5M  Query wikidb (12) (slave): SELECT /* LCStoreDB::get  */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'list'  LIMIT 1
0.9567  10.8M  Query wikidb (13) (slave): SELECT /* LCStoreDB::get  */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'preload'  LIMIT 1
0.9572  10.8M  Query wikidb (14) (slave): SELECT /* LCStoreDB::get  */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'preload'  LIMIT 1
...
0.9875  21.2M  Query wikidb (195) (slave): SELECT /* LCStoreDB::get Greg */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'messages:accesskey-pt-mycontris'  LIMIT 1
0.9873  21.2M  Query wikidb (196) (slave): SELECT /* LCStoreDB::get Greg */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'messages:tooltip-pt-logout'  LIMIT 1
0.9868  21.2M  Query wikidb (197) (slave): SELECT /* LCStoreDB::get Greg */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'messages:accesskey-pt-logout'  LIMIT 1
0.9883  21.2M  Query wikidb (198) (slave): SELECT /* LCStoreDB::get Greg */  lc_value  FROM "l10n_cache"   WHERE lc_lang = 'en' AND lc_key = 'messages:vector-more-actions'  LIMIT 1

Just to load a simple page, there were 194 SELECT statements! And 137 of those were trying to look in the l10n_cache table, one row at a time. Clearly, there is a lot of room for improvement there. Someday, I may even jump in and tackle that. But for now, despite being very inefficient, it is also very fast. Because of the $wgDebugTimestamps, it was easy to compute how much time both wikis spent actually creating the page and sending it back to Apache. In this case, the difference was minimal, which meant MediaWiki was not the culprit.

I then turned my attention to Apache. Perhaps it was compiled differently? Perhaps there was some obscure SSL bug slowing things down for everyone? These were unlikely, but it was worth checking the Apache logs (which were in /var/log/httpd). There are two main logs Apache uses: access and error. The latter revealed nothing at all when I loaded the main wiki page. The access logs looked fairly normal:

85.236.207.120 - greg [19/Jan/2016:12:23:21 -0500] "GET /wiki/Main_Page HTTP/1.1" 200 23558 "-" "Mozilla/5.0 Firefox/43.0"
85.236.207.120 - greg [19/Jan/2016:12:23:22 -0500] "GET /mediawiki/extensions/balloons/js/balloon.config.js HTTP/1.1" 200 4128 "https://wiki.endpoint.com/wiki/Main_Page" "Mozilla/5.0 Firefox/43.0"
...
85.236.207.120 - greg [19/Jan/2016:12:23:22 -0500] "GET /mediawiki/load.php?debug=false&lang=en&modules=mediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.sectionAnchor%7Cmediawiki.skinning.interface%7Cskins.vector.styles&only=styles&skin=vector HTTP/1.1" 200 58697 "https://wiki.endpoint.com/wiki/Main_Page" "Mozilla/5.0 Firefox/43.0"
85.236.207.120 - greg [19/Jan/2016:12:23:22 -0500] "GET /mediawiki/resources/assets/poweredby_mediawiki_88x31.png HTTP/1.1" 200 3525 "https://wiki.endpoint.com/wiki/Main_Page" "Mozilla/5.0 Firefox/43.0"

Still nothing out of the ordinary. What to do next? When all else fails, go to the system calls. It's about as close to bare metal as you can easily get on a Linux system. In this case, I decided to run strace on the Apache daemon to see exactly where the time was being spent. As expected, there were a large handful of httpd processes already spawned and waiting for a connection. While there was no way to know which one would field my requests, some shell-fu allowed me to strace them all at once:

## The -u prevents us from picking the parent httpd process, because it is owned by root!
$ strace -o greg.httpd.trace -tt -ff `pgrep -u apache httpd | xargs -n 1 echo -p | xargs`
Process 5148 attached
Process 4848 attached
Process 5656 attached
Process 4948 attached
Process 5149 attached
Process 5148 attached
Process 4858 attached
Process 5657 attached
Process 4852 attached
Process 4853 attached
^CProcess 5148 detached
Process 4848 detached
Process 5656 detached
Process 4948 detached
Process 5149 detached
Process 5148 detached
Process 4858 detached
Process 5657 detached
Process 4852 detached
Process 4853 detached

Looking at the output of one of these revealed some important clues:

$ head greg.httpd.trace.4948
13:00:28.799807 read(14, "27332221135-3323123(200302"251'g256363b5"..., 8000) = 666
13:00:28.799995 stat("/wiki/htdocs/mediawiki/load.php", {st_mode=S_IFREG|0644, st_size=1755, ...}) = 0
13:00:28.800126 open("/wiki/htpasswd.users", O_RDONLY|O_CLOEXEC) = 15
13:00:28.800176 fstat(15, {st_mode=S_IFREG|0640, st_size=947, ...}) = 0
13:00:28.800204 read(15, "alice:{SHA}jA0EAgMCMEpo4Wa3n/9gy"..., 4096) = 2802
13:00:28.800230 close(15)               = 0
13:00:29.496369 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={60, 0}}, NULL) = 0
13:00:29.496863 rt_sigaction(SIGPROF, {0x7fc962da7ab0, [PROF], SA_RESTORER|SA_RESTART, 0x7fc970605670}, {0x7fc962da7ab0, [PROF], SA_RESTORER|SA_RESTART, 0x7fc970605670

Aha! If you look closely at those timestamps, you will notice that the time gap between the call to close() and the subsequent setitimer() is quite large, at 0.69 seconds. That's a long time for Apache to be waiting around for something. The second clue is the file it just opened: "htpasswd.users". Seeing the top of the file, with the {SHA} in quotes, made me realize the problem - htpasswd files now support bcrypt as a password hashing scheme, and bcrypt is designed to be secure - and slow. Sure enough, the htpasswd file had bcrypt entries with a high cost for the people who were having the most issues with the speed. This is what the file looked like (names and values changed):

alice:{SHA}jA0EAgMCMEpo4Wa3n/9gybBBsDPa
greg:$2y$13$+lE6+EwgtzP0m8K8VQDnYMRDRMf6rNMRZsCzko07QQpskKI9xbb/y9
mallory:$2y$15$ww8Q4HMI1Md51kul2Hiz4ctetPqJ95cmspH8T81JHfqRvmg===rVgn
carol:7RnEKJWc38uEO
bob:$apr1$uKX9Z63CqPOGX4lD1R4yVZsloJyZGf+
jon:$2y$08$SUe3Z8sgEpyDWbWhUUUU5wtVTwlpEdc7QyXOg3e5WBwM4Hu35/OSo1
eve:$apr1$I/hv09PcpU0VfXhyG7ZGaMz7Vhxi1Tm

I recognized the bcrypt format right away ($2y$13$). The people complaining the most about the speed of the wiki (e.g. mallory in the example above) had the highest costs, while those with low costs (e.g. jon), and those using something other than bcrypt (everyone else above), were not complaining at all! The 'cost' is the number after the second dollar sign: as you can see, some of them had a cost of 15, which is much more expensive than the cost of 13 my own user ("greg") was using. This was a smoking gun, but one more step was needed for proof. I adjusted the cost of my password to something low using the htpasswd program:

$ htpasswd -B -C 6 /wiki/htpasswd.users greg
New password: 
Re-type new password: 
Updating password for user greg

Voila! The page loaded in a flash. I then changed the cost to 15, and suddenly the wiki was even slower than before - taking upwards of 15 seconds to load the main page. Mystery solved. All those high-cost bcrypt requests are also not good for the server: not only do they use a lot of CPU, they also keep the Apache daemon tied up waiting for the bcrypt computation to finish, rather than letting it finish quickly and go back to the main pool.
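
For context, the wiki was protected by ordinary Apache basic authentication, wired up with something along these lines (the realm name is illustrative; the paths match those seen in the strace output above):

<Directory "/wiki/htdocs/mediawiki">
    AuthType Basic
    AuthName "Wiki"
    AuthUserFile /wiki/htpasswd.users
    Require valid-user
</Directory>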

You may be asking a few questions at this point, however. Why would htpasswd offer a footgun like this? Why such a radical difference in effect for slightly different costs? Is bcrypt a good practice for a htpasswd file? Let's attempt to answer those. Before we do, we have to learn a little bit about bcrypt and passwords in general. Some of this is purposefully oversimplified, so be gentle in the comments. :)

Passwords themselves are never stored on a server (aka the machine doing the authentication). Instead, the server stores a hash of the password. This is created by what is known as a "one-way" function, which creates a unique fingerprint of your password. If this fingerprint (aka hash) is discovered, there is no direct way to see the password that created it. When you log in to a site, it creates a hash of the password you give it, then compares that hash to the one it has stored. Thus, it can verify that you have given it the correct password without actually having to store the password.

For a long time, very simple algorithms were used to create these hashes. However, as computers became more powerful, and as the field of cryptography advanced, it became easier to "crack" these hashes and determine the password that was used to create them. This was an important problem, and one of the solutions people came up with was the bcrypt algorithm, which makes the computation of the hash very expensive in terms of computer time. Furthermore, that expense is adjustable, determined by the "cost" given at creation time. You may have noticed the -C option I used in the htpasswd example above. That number controls how many rounds the algorithm must go through: the cost leads to 2^cost rounds, so the work grows exponentially. In other words, a cost of 13 means that bcrypt runs 2 to the 13th power rounds, or 8,192 rounds. A cost of 14 is 2 to the 14th power, or 16,384 rounds - twice as slow as a cost of 13! A cost of 15 is 32,768 rounds, etc. Thus, one can see why even a cost of 15 would be much slower than a cost of 13.

A web page usually returns more than just the requested HTML. There are commonly images, CSS, and JavaScript files that must also be loaded from the webserver to fully render the page. Each of these requests must go through basic auth, and thus gets slowed down by bcrypt. This is why, even though each basic authentication at a bcrypt cost of 15 only takes a couple of seconds, the entire web page can take much longer.

What encryption options are available for the htpasswd program? The bcrypt option was introduced without much fanfare in version 2.4.4 of Apache, which was released on February 25, 2013. So, it's been around a while. The output of --help shows us that bcrypt is the only option marked as secure, though several legacy schemes are still allowed. Also note that valid bcrypt costs range from 4 to 31:

 -m  Force MD5 encryption of the password (default).
 -B  Force bcrypt encryption of the password (very secure).
 -C  Set the computing time used for the bcrypt algorithm
     (higher is more secure but slower, default: 5, valid: 4 to 31).
 -d  Force CRYPT encryption of the password (8 chars max, insecure).
 -s  Force SHA encryption of the password (insecure).
 -p  Do not encrypt the password (plaintext, insecure).

So should you use bcrypt for your htpasswd? Absolutely yes. Even a lower-cost bcrypt is far more secure than MD5, CRYPT, or SHA. A cost of 10 is roughly the same speed as those, but a much, much better choice. To get a rough idea of how much impact a given cost will have on your website, you can time how long the command-line htpasswd program takes to create or update a password at that cost, and use that as a rough proxy for the extra page load time. Here are some numbers I generated on my local box. Numbers represent the average of a few runs:

Bcrypt cost   htpasswd creation time         Web page load time
10            0.079 seconds                  5.68 seconds
12            0.268 seconds                  6.77 seconds
14            0.979 seconds                  10.78 seconds
16            3.684 seconds                  25.72 seconds
18            14.683 seconds                 88.85 seconds
20            58.680 seconds                 358.80 seconds
22            236.369 seconds                1357.82 seconds
31            186,173 seconds                Ah...no
              (51 hours and 42 minutes!!)
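
If you want to generate similar numbers for your own server, timing the htpasswd program directly is a quick sketch (the -n flag prints the hash to stdout rather than updating a file, and -b takes the password on the command line, so nothing is modified):

$ time htpasswd -nbB -C 14 testuser testpassword

Run that a few times at each cost you are considering and compare the results against your page load budget.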

There are times when you really do want a higher bcrypt cost. The basic auth usage in this scenario is really the exception, not the norm. In most cases, a password will be used to log in to something, and you will either create a persistent connection (e.g. SSH), or a cookie with a temporary token will be issued (e.g. almost every website in the world). In those cases, a few seconds' delay is quite acceptable, as it is a rare event.

So why do we even care about passwords so much, especially for something like basic auth and a htpasswd file? After all, if someone can view the contents of the htpasswd file, they can also more than likely view whatever material on the web server it was designed to protect. These days, however, it's important to view strong hashes such as bcrypt as not just protecting data, but protecting the password as well. Why? Password reuse. It's very common for people to use the same (or very similar) password on all the sites they visit. The danger is thus not that an attacker can view the file contents protected by the htpasswd file, but that an attacker can use that password on the user's email accounts, or on other sites the user may have visited and used the same password.

What bcrypt cost should you use? The general answer is to use the highest cost you can get away with: take something with such a high cost that it causes discomfort to the users, then dial it back a tiny bit. Measure it out and see what your server can handle. For general bcrypt use, start with 13, but don't be afraid to keep going up until it takes a wall clock second or two to run. For basic auth, use something very fast: perhaps 9 or less. Anything that takes over a second to create via htpasswd will slow a site down noticeably!


published by noreply@blogger.com (Patrick Lewis) on 2016-02-08 15:38:00 in the "css" category
The Zurb Foundation 5 front-end framework provides many convenient features such as the ability to control the visibility of HTML elements for different browser window sizes using CSS classes. Foundation CSS classes like 'show-for-small-only' and 'hide-for-large-up' (full list at http://foundation.zurb.com/sites/docs/v/5.5.3/components/visibility.html) make it easy to add mobile-specific content to your page or prevent certain page elements from being displayed on mobile devices.

Having an easy way to show/hide elements based on viewport size is nice, but what if you want to style an element differently based on the size of the browser that's viewing the page? Foundation has you covered there, too, though the method is less obvious. It's possible to use Foundation's media query SCSS variables when writing your own custom styles in order to apply different styling rules for different viewport sizes.

For example, if you have an element that you want to offset with a margin in a large window but be flush with the left edge of a small window, you can use the Foundation media query variables to apply a styling override that's specific to small windows:

#element { 
    margin-left: 100px;
 
    @media #{$small-only} {
        margin-left: 0;
    }
}

This will apply a specific set of styling to the element for small viewports and a different set for medium and larger viewports (with the definitions for 'small', 'medium', etc. corresponding to the same values used by Foundation's visibility classes like 'show-for-small-only' which were mentioned at the start of this post).
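
The same approach extends to the other breakpoints. For instance, here is a quick sketch using the medium and large ranges; the variable names $medium-only and $large-up are the ones Foundation 5 defines, but double-check them against your version's settings file:

#element {
    @media #{$medium-only} {
        margin-left: 50px;
    }

    @media #{$large-up} {
        margin-left: 100px;
    }
}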

It wasn't immediately obvious to me how to apply conditional styling using Foundation's own definitions of small, medium, etc. viewport sizes, but luckily the variable definitions provided by the SCSS framework make it easy to do so.

published by noreply@blogger.com (Greg Sabino Mullane) on 2016-01-28 17:49:00 in the "database" category

The recent release of Postgres 9.5 has many people excited about the big new features such as UPSERT (docs) and row-level security (docs). Today I would like to celebrate three of the smaller features that I love about this release.

Before jumping into my list, I'd like to thank everyone who contributes to Postgres. I did some quick analysis and found that 85 people, from Adrien to Zeus, have helped version 9.5 of Postgres, at least according to the git logs. Of course, that number is actually higher, as it doesn't take into account people helping out on the #postgresql channel, running buildfarm animals, doing packaging work, keeping the infrastructure running, etc. Thanks to you all!

Feature: REINDEX VERBOSE

The first feature is one I've been wanting for a long time - a verbose form of the REINDEX command. Thanks to Sawada Masahiko for adding this. Similar to VACUUM, REINDEX gets kicked off and then gives no progress or information until it finishes. While VACUUM has long had the VERBOSE option to get around this, REINDEX gave you no clue as to which index it was working on, or how much work each index took to rebuild. Here is a normal reindex, along with another 9.5 feature, the ability to reindex an entire schema:

greg=# reindex schema public;
## What seems like five long minutes later...
REINDEX

The new syntax uses parentheses to support VERBOSE and any other future options. If you are familiar with EXPLAIN's newer options, you may see a similarity. More on the syntax in a bit. Here is the much improved version in action:

greg=# reindex (verbose) schema public;
INFO:  index "foobar_pkey" was reindexed
DETAIL:  CPU 11.00s/0.05u sec elapsed 19.38 sec.
INFO:  index "foobar_location" was reindexed
DETAIL:  CPU 5.21s/0.05u sec elapsed 18.27 sec.
INFO:  index "location_position" was reindexed
DETAIL:  CPU 9.10s/0.05u sec elapsed 19.70 sec.
INFO:  table "public.foobar" was reindexed
INFO:  index "foobaz_pkey" was reindexed
DETAIL:  CPU 7.04s/0.05u sec elapsed 19.61 sec.
INFO:  index "shoe_size" was reindexed
DETAIL:  CPU 12.26s/0.05u sec elapsed 19.33 sec.
INFO:  table "public.foobaz" was reindexed
REINDEX

Why not REINDEX VERBOSE TABLE foobar? Or even REINDEX TABLE foobar WITH VERBOSE? There was a good discussion of this on pgsql-hackers when this feature was being developed, but the short answer is that parentheses are the correct way to do things moving forward. Robert Haas summed it up well:

"
The unparenthesized VACUUM syntax was added back before we realized that that kind of syntax is a terrible idea. It requires every option to be a keyword, and those keywords have to be in a fixed order. I believe the intention is to keep the old VACUUM syntax around for backward-compatibility, but not to extend it. Same for EXPLAIN and COPY.
"

The psql help option should show the new syntax:

greg=# \h REINDEX
Command:     REINDEX
Description: rebuild indexes
Syntax:
REINDEX [ ( { VERBOSE } [, ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name

Feature: pg_ctl defaults to "fast" mode

The second feature in Postgres 9.5 I am happy about is the change in pg_ctl's default shutdown mode from "smart" to "fast". The help output of pg_ctl explains the different modes fairly well:

pg_ctl is a utility to initialize, start, stop, or control a PostgreSQL server.

Usage:
  pg_ctl stop    [-W] [-t SECS] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]
...
Shutdown modes are:
  smart       quit after all clients have disconnected
  fast        quit directly, with proper shutdown
  immediate   quit without complete shutdown; will lead to recovery on restart

In the past, the default was 'smart', which often meant your friendly neighborhood DBA would type "pg_ctl restart -D data", then watch the progress dots slowly marching across the screen, until they remembered that the default mode of "smart" is kind of dumb - as long as there is one connected client, the restart will not happen. Thus, the DBA had to cancel the command and rerun it as "pg_ctl restart -D data -m fast". Then they would vow to remember to add the -m switch next time. And promptly forget it the next time they did a shutdown or restart. :) Now pg_ctl has a much better default. Thanks, Bruce Momjian!
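
If you do still want one of the other modes, you can always ask for it explicitly, for example:

$ pg_ctl restart -D data -m smart
$ pg_ctl stop -D data -m immediate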

Feature: new 'cluster_name' option

When you run a lot of different Postgres clusters on your server, as I tend to do, it can be hard to tell them apart as the version and port are not reported in the ps output. I sometimes have nearly a dozen different clusters running, due to testing different versions and different applications. Similar in spirit to the application_name option, the new cluster_name option solves the problem neatly by allowing a custom string to be put in to the process title. Thanks to Thomas Munro for inventing this. So instead of this:

greg      7780     1  0 Mar01 pts/0    00:00:03 /home/greg/pg/9.5/bin/postgres -D data
greg      7787  7780  0 Mar01 ?        00:00:00 postgres: logger process   
greg      7789  7780  0 Mar01 ?        00:00:00 postgres: checkpointer process   
greg      7790  7780  0 Mar01 ?        00:00:09 postgres: writer process   
greg      7791  7780  0 Mar01 ?        00:00:06 postgres: wal writer process   
greg      7792  7780  0 Mar01 ?        00:00:05 postgres: autovacuum launcher process   
greg      7793  7780  0 Mar01 ?        00:00:11 postgres: stats collector process  
greg      6773     1  0 Mar01 pts/0    00:00:02 /home/greg/pg/9.5/bin/postgres -D data2
greg      6780  6773  0 Mar01 ?        00:00:00 postgres: logger process   
greg      6782  6773  0 Mar01 ?        00:00:00 postgres: checkpointer process   
greg      6783  6773  0 Mar01 ?        00:00:04 postgres: writer process   
greg      6784  6773  0 Mar01 ?        00:00:02 postgres: wal writer process   
greg      6785  6773  0 Mar01 ?        00:00:02 postgres: autovacuum launcher process   
greg      6786  6773  0 Mar01 ?        00:00:07 postgres: stats collector process

One can adjust the cluster_name inside each postgresql.conf (for example, to 'alpha' and 'bravo'), and get this:

greg      8267     1  0 Mar01 pts/0    00:00:03 /home/greg/pg/9.5/bin/postgres -D data
greg      8274  8267  0 Mar01 ?        00:00:00 postgres: alpha: logger process   
greg      8277  8267  0 Mar01 ?        00:00:00 postgres: alpha: checkpointer process   
greg      8278  8267  0 Mar01 ?        00:00:09 postgres: alpha: writer process   
greg      8279  8267  0 Mar01 ?        00:00:06 postgres: alpha: wal writer process   
greg      8280  8267  0 Mar01 ?        00:00:05 postgres: alpha: autovacuum launcher process   
greg      8281  8267  0 Mar01 ?        00:00:11 postgres: alpha: stats collector process  
greg      8583     1  0 Mar01 pts/0    00:00:02 /home/greg/pg/9.5/bin/postgres -D data2
greg      8590  8583  0 Mar01 ?        00:00:00 postgres: bravo: logger process   
greg      8592  8583  0 Mar01 ?        00:00:00 postgres: bravo: checkpointer process   
greg      8591  8583  0 Mar01 ?        00:00:04 postgres: bravo: writer process   
greg      8593  8583  0 Mar01 ?        00:00:02 postgres: bravo: wal writer process   
greg      8594  8583  0 Mar01 ?        00:00:02 postgres: bravo: autovacuum launcher process   
greg      8595  8583  0 Mar01 ?        00:00:07 postgres: bravo: stats collector process
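
For reference, the change behind that output is just one line in each cluster's postgresql.conf (the names here are the example values used above, and a restart is needed for the setting to take effect):

# data/postgresql.conf
cluster_name = 'alpha'

# data2/postgresql.conf
cluster_name = 'bravo'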

There are a lot of other things added in Postgres 9.5. I recommend you visit this page for a complete list, and poke around on your own. A final shout out to all the people that are continually improving the tab-completion of psql. You rock.


published by noreply@blogger.com (Jeff Boes) on 2016-01-27 14:00:00 in the "Interchange" category

This is an article aimed at beginner-to-intermediate Interchange developers.

A typical approach to a hierarchical Interchange site is:

Categories -> Category -> Product

I.e., you list all your categories as links, each of which opens up a search results page filtering the products by category, with links to the individual product pages via the flypage.

Recently I upgraded a site so the category URLs were a bit more SEO-friendly. The original category filtering search produced these lovely specimens:

/search.html?fi=products&st=db&co=1&sf=category&se=Shoes&op=rm
   &sf=inactive&se=yes&op=ne&tf=category&ml=100

but what I really wanted was:

/cat/Shoes.html

Such links are easier to communicate to users, more friendly to search engines, less prone to breakage (e.g., by getting word-wrapped in email clients), and avoid exposing details of your application (here, we've had to admit publicly that we have a table called "products" and that some items are "inactive"; a curious user might decide to see what happens if they change "sf=inactive&se=yes" to some other expression).

Here's how I attacked this.

Creating a category listing page

First, I copied my "results.html" page to "catpage.html". That way, my original search results page can continue to serve up ad hoc search results.

The search results were displayed via:

[search-region]
...
[/search-region]

I converted this to a database query:

[query sql="SELECT * FROM products WHERE NOT inactive AND category = [sql-quote][cgi category][/sql-quote]"
 type=list prefix=item]
...
[/query]

I chose to use a prefix other than the default since it would avoid having to change so many tags in the page, and now both the original search page and new catpage would look much the same internally (and thus, if desired, I could refactor them in the future).

Note that I've defined part of the API for this page: the category to be searched is set in a CGI variable called "category".

In my specific case, there was additional tinkering with this tag, because I had nested [query] tags already in the page within the search-region.

Creating a "cat" actionmap

In order to translate a URL containing SEO-friendly "/cat/Shoes.html" into my search, I need an actionmap. Here's mine; it's very simple.

Actionmap cat <<"CODE"
sub {
  my $url = shift;
  my @url_parts = split '/' => $url;
  shift @url_parts if $url_parts[0] eq 'cat';

  $CGI->{mv_nextpage} = 'catpage.html';
  $CGI->{category} = shift @url_parts;
  return 1;
}
CODE

Actionmaps are called when Interchange detects that a URL begins with the actionmap's name; here "cat". They are passed a parameter containing the URL fragment (after removing all the site stuff). Here, that would be (e.g.) "/cat/Shoes". We massage the URL to get our category code, and set up the page to be called along with the CGI parameter(s) it expects.

Cleaning up the links

At the start of this article I noted that I may have a page listing all my categories. In my original setup, this generated links using a construction like this:


  <a href="/search.html?fi=products&st=db&co=1&sf=category&se=Shoes&op=rm&sf=inactive&se=yes&op=ne&tf=category&ml=100">Shoes</a>

Now my links are the much simpler:

<a href="/cat/Shoes.html">Shoes</a>

In my specific case, these links were generated within a [query] loop, but the approach is the same.
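
For illustration only, a category listing generated inside a [query] loop might look roughly like this; treat it as a sketch, since the exact item tags and column names depend on your catalog setup:

[query sql="SELECT DISTINCT category FROM products WHERE NOT inactive ORDER BY category"
 type=list prefix=cat]
<a href="/cat/[cat-param category].html">[cat-param category]</a>
[/query]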

Note: the Strap demo supports SEO-friendly URLs out-of-the-box, and it is included with the latest Interchange 5.10 release.


published by noreply@blogger.com (Marina Lohova) on 2016-01-26 14:17:00 in the "css" category

I was recently looking to build a full screen image and video gallery for our client Mission Blue. Something similar to the Google Maps interface you can see in the screenshot below:

After scouring the Internet to find a suitable jQuery plugin, I finally decided on Supersized, a full screen background slideshow plugin for jQuery.

After downloading the library, include it on the page:

<link href="/wp-content/plugins/wp-supersized/theme/supersized.shutter.css?ver=4.2.2" id="supersized_theme_css-css" media="all" rel="stylesheet" type="text/css"></link>
<script src="/wp-includes/js/jquery/ui/effect.min.js?ver=1.11.4" type="text/javascript"></script>
<script src="/wp-content/plugins/wp-supersized/js/jquery.easing.min.js?ver=1.3" type="text/javascript"></script>
<script src="/wp-content/plugins/wp-supersized/js/jquery.easing.compatibility.js?ver=1.0" type="text/javascript"></script>
<script src="/wp-content/plugins/wp-supersized/js/jquery.animate-enhanced.min.js?ver=0.75" type="text/javascript"></script>
<script type='text/javascript' src='/wp-content/plugins/wp-supersized/js/supersized.3.2.7.min.js?ver=3.2.7'></script>

Basic functionality

Let's create a variable that will hold all the images in the slideshow:

var images = [];
images.push({
  type: 'IMAGE',
  image: 'img1.jpg',
  title: 'Image 1',
  thumb: 'img1_thumb.jpg',
  url: 'http://www.endpoint.com'
});
images.push({
  type: 'YOUTUBE',
  image: 'screenshot1.jpg',
  title: 'YouTube slide',
  videoid: 'abc12345678',
  thumb: 'screenshot1_thumb.jpg',
  url: 'https://www.youtube.com/watch?v=abc12345678'
});

Let's initialize Supersized:

jQuery.supersized({
  slideshow: 1,
  autoplay: 0,
  min_width: 0,
  min_height: 0,
  vertical_center: 1,
  horizontal_center: 1,
  fit_always: 0,
  fit_portrait: 1,
  fit_landscape: 0,
  slide_links: 'blank',
  thumb_links: 1,
  thumbnail_navigation: 1,
  slides: images,
  mouse_scrub: 0
});

Customizing the toolbar

<div id="thumb-tray" class="load-item">
  <div id="thumb-back"></div>
  <div id="thumb-forward"></div>
</div>
<div id="slidecaption"></div>

Customizing the screen image size

I didn't want to have the full screen image as it was a little overwhelming for the user. I wanted the black bars just like in the Google interface. Supersized allows for easy customization. This CSS did the trick:

#supersized, #supersized li {
  width: 70% !important;
  left: 0 !important;
  right: 0 !important;
  top: 1px !important;
  margin:auto;
}

Introducing video (YouTube) slides

First, I added the YouTube iframe API:

<script type="text/javascript" src="https://www.youtube.com/iframe_api"></script>

Then I added a couple of CSS styles:

#supersized .player {
  margin: auto;
  display: block;
}

Finally, I went into the Supersized library source and modified it. To allow video slides to appear, I added a new condition for the slide type 'YOUTUBE':

base._renderSlide = function(loadPrev, options) {
  var linkTarget = base.options.new_window ? ' target="_blank"' : '';
  var imageLink = (base.options.slides[loadPrev].url) ? "href='" + base.options.slides[loadPrev].url + "'" : "";
  var slidePrev = base.el + ' li:eq(' + loadPrev + ')';
  var imgPrev = $('<img src="' + base.options.slides[loadPrev].image + '"/>');

  if (base.options.slides[loadPrev].type == 'YOUTUBE') {
    imgPrev.load(function () {
      var video = $('<div class="player" id="player'+ base.options.slides[loadPrev].videoid + '"></div>');
      video.appendTo(slidePrev);
      var player = new YT.Player('player' + base.options.slides[loadPrev].videoid, {
        height: 390,
        width: 640,
        videoId: base.options.slides[loadPrev].videoid
      });
    });// End Load
  }
  else {
    imgPrev.appendTo(slidePrev).wrap('<a ' + imageLink + linkTarget + '></a>').parent().parent().addClass('image-loading ' + options['class']);

    imgPrev.load(function () {
      $(this).data('origWidth', $(this).width()).data('origHeight', $(this).height());
      base.resizeNow();// Resize background image
    });// End Load
  }
};

Final Result

This is how the gallery looks with the customizations:

This is what a video slide looks like:

Hope you found this writeup useful!


published by noreply@blogger.com (Marina Lohova) on 2016-01-22 16:11:00 in the "ImageMagick" category

Are you running a WordPress blog, but secretly dying to have that Medium parallax blur effect? I recently implemented this, and would like to share it with you. By the way, while I was working on this article, the effect was removed from Medium, which only makes having one on your website more precious.

Let's assume that we have our custom theme class MyTheme. In functions.php:

class MyThemeBaseFunctions {
  public function __construct() {
    add_image_size('blurred', 1600, 1280, true);
    add_filter('wp_generate_attachment_metadata', array($this,'wp_blur_attachment_filter'));
  }
}

We added a custom image size called 'blurred', and hooked a callback, wp_blur_attachment_filter, into wp_generate_attachment_metadata. That callback is where the magic happens.

Before we get to it, let's talk a little about ImageMagick, a powerful image processing library that we will use to create the blur effect. After some experimenting, I found that the image needed to be darkened, and then a regular blur applied with sigma=20. You can read more about these settings at ImageMagick Blur Usage. I used Gaussian blur at first, but found the processing extremely slow, and there wasn't much difference in the end result compared to other blur methods.
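
If you want to preview the effect before wiring it into WordPress, the same operations can be run with ImageMagick's command-line convert tool (the file names here are placeholders):

$ convert input.jpg -modulate 75,105,100 -blur 0x20 output.jpg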

Now we are ready to write a blurring function:

public function wp_blur_attachment_filter($image_data) {
    if ( !isset($image_data['sizes']['blurred']) )
        return $image_data;

    $upload_dir  = wp_upload_dir();
    $src         = $upload_dir['path'] . '/' . $image_data['sizes']['large']['file'];
    $destination = $upload_dir['path'] . '/' . $image_data['sizes']['blurred']['file'];
    $imagick = new Imagick($src);
    $imagick->blurImage(0, 20, Imagick::CHANNEL_ALL);
    $imagick->modulateImage(75, 105, 100);
    $imagick->writeImage($destination);
    return $image_data;
}

I darken the image:

$imagick->modulateImage(75, 105, 100);

And I blur the image:

$imagick->blurImage(0, 20, Imagick::CHANNEL_ALL);

Now we are able to use the custom image size in the template. Place these helper functions in functions.php:

public static function non_blurred($src) {
  $url = get_site_url() . substr($src, strrpos($src, '/wp-content'));
  $post_ID = attachment_url_to_postid($url);
  list($url, $width, $height) = wp_get_attachment_image_src($post_ID, 'large');
  return $url;
}

public static function blurred($src) {
  $url = get_site_url() . substr($src, strrpos($src, '/wp-content'));
  $post_ID = attachment_url_to_postid($src);
  list($url, $width, $height) = wp_get_attachment_image_src($post_ID, 'blurred');
  return $url;
}

And now use it in the template like this:

<div class="blurImg">
  <div style="background-image: url('<?php echo MyTheme::non_blurred(get_theme_mod( 'header' )); ?>')"></div>
  <div style="background-image: url('<?php echo MyTheme::blurred(get_theme_mod('header')); ?>'); opacity: 0;" class="blur"></div>
</div>
<header></header>

Add CSS:

.blurImg {
  height: 440px;
  left: 0;
  position: relative;
  top: 0;
  width: 100%;
  z-index: -1;
}

.blurImg > div {
  background-position: center center;
  background-repeat: no-repeat;
  background-size: cover;
  height: 440px;
  position: fixed;
  width: 100%;
}

header {
  padding: 0 20px;
  position: absolute;
  top: 0;
  width: 100%;
  z-index: 1;
}

Add JavaScript magic sauce to gradually replace the non-blurred image with the blurred version as the user scrolls:

(function() {
    jQuery(window).scroll(function() {
      var H = 240;
      var oVal = jQuery(window).scrollTop() / H;
      jQuery(".blurImg .blur").css("opacity", oVal);
    });
  }).call(this);

Generating the blurred version can be very strenuous on the server. I would oftentimes receive the error PHP Fatal error: Maximum execution time of 30 seconds exceeded. There are ways to work around that. One way is to use a quicker method of blurring the image by shrinking it, blurring, and resizing it back. Another way is to use a background job, something like this:

add_action( 'add_attachment', array($this, 'wp_blur_attachment_filter') );
add_action( 'wp_blur_attachment_filter_hook', 'wp_blur_attachment_filter_callback');
function wp_blur_attachment_filter_callback($path) {
  $path_parts = pathinfo($path);
  $imagick = new Imagick($path);
  $imagick->blurImage(0, 20, Imagick::CHANNEL_ALL);
  $imagick->modulateImage(75, 105, 100);
  $destination = dirname($path) . "/" . $path_parts['filename'] . "_darken_blur." . $path_parts['extension'];
  $imagick->writeImage($destination);
}

public function wp_blur_attachment_filter($post_ID) {
  $path = get_attached_file( $post_ID );
  wp_schedule_single_event(time(), 'wp_blur_attachment_filter_hook', array($path));
}
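
As for the shrink-then-blur shortcut mentioned above, here is a rough sketch; the quarter-size scale factor and the sigma of 5 are arbitrary starting points, and $src and $destination stand for the same paths used earlier:

$imagick = new Imagick($src);
$width  = $imagick->getImageWidth();
$height = $imagick->getImageHeight();
// Blur a quarter-size copy (much cheaper), then scale back up.
$imagick->scaleImage((int) ($width / 4), (int) ($height / 4));
$imagick->blurImage(0, 5, Imagick::CHANNEL_ALL);
$imagick->modulateImage(75, 105, 100);
$imagick->scaleImage($width, $height);
$imagick->writeImage($destination);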

Or better yet, use cloud image processing; I wrote about that here.

I hope you found this writeup useful!


published by noreply@blogger.com (David Christensen) on 2016-01-15 20:30:00 in the "database" category

With all the major changes and improvements to PostgreSQL's native replication system through the last few major releases, it's easy to forget that there can be benefits to having some of the tried and true functionalities from older PostgreSQL versions in place.

In particular, with the ease of setting up Hot Standby/Streaming Replication, it's easy to get replication going with almost no effort. Replication is great for redundancy, scaling, and backups; however, it does not solve all potential data-loss problems. For best results, use it in conjunction with Point-in-Time Recovery (PITR) and the archiving features of PostgreSQL.

Background

We recently had a client experience a classic blunder with their database: namely, performing a manual UPDATE without wrapping it in a transaction block. The table in question was the main table in the application, and the client had run an unqualified UPDATE, unintentionally setting a specific field to a constant value instead of targeting the specific row they thought they were going for.

Fortunately, the client had backups. Unfortunately, the backups alone would not be enough: since they were a snapshot of the data from earlier in the day, restoring them would have lost all changes made throughout the day.

This resulted in a call to us to help out with the issue. We fortunately had information about precisely when the errant UPDATE took place, so we were able to use this information to help target a PITR-based restore.

The Approach

Since we did not want to lose other changes made in this database cluster either before or after this mistake, we came up with the following strategy which would let us keep the current state of the database but just recover the field in question:

  1. Create a parallel cluster for recovery.
  2. Load the WAL until just before the time of the event.
  3. Dump the table in question from the recovery cluster.
  4. Load the table in the main cluster with a different name.
  5. Use UPDATE ... FROM to update the field values for the table with their old values, joining on the table's primary key (a sketch follows this list).
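
A minimal sketch of that final step, using hypothetical table and column names (main_table, main_table_recovered, id, the_field):

UPDATE main_table AS m
   SET the_field = r.the_field
  FROM main_table_recovered AS r
 WHERE m.id = r.id;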

In practice, this worked out pretty well, though of course there were some issues that had to be worked around.

PostgreSQL's PITR relies on its WAL archiving mechanism combined with taking regular base backups of the data directory. As part of the archive setup, you choose the strategies (such as the frequency of the base backups) and ensure that you can recover individual WAL segment files when desired.

In order for the above strategy to work, you need hardware to run it on. The client had proposed their standby server, which was definitely equipped to handle this and did not have much load. The client had initially suggested that we could break the replication, but we recommended against that: there was sufficient disk space to run the parallel cluster alongside it, and leaving the replica intact avoided the future work and risk of rebuilding it afterward.

We copied over the daily base backup into its own directory/mount point here, adjusted the recovery.conf file to point to the local WAL directory, and copied the necessary WAL files from the archive location to the pg_xlog directory of the new cluster. We also had to adjust a few parameters in the new cluster, most notably the "port" parameter to run the cluster on a different port. We also used the timestamp of the incident as a target for the recovery.conf's recovery_target_time setting. After starting up the new cluster and letting things process, we were able to dump the table in question and finish the recovery on the master.
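
For illustration, the recovery.conf for the parallel cluster ended up along these lines; the path and timestamp shown here are placeholders rather than the client's actual values, and the port change mentioned above goes in postgresql.conf, not here:

restore_command = 'cp /path/to/local/wal_archive/%f %p'
recovery_target_time = '2016-01-10 14:20:00'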

Some issues did come up that required expert-level knowledge of the system, as well as some good luck in the timing of the event. We had to locate several of the WAL files in the initial archive on the primary server due to some issues with the (inherited by us) configuration. Also, despite the time it took to create the parallel cluster, we were able to bring up the new instance before the next nightly base backup was run, which was fortunate, because otherwise we would have been unable to resolve this issue. (The client had things configured to keep only a single base backup around.)

Lessons Learned

With any issue, there is a takeaway, so what are those here?

  • Always use explicit transactions when manually modifying data, or modify your production environment's .psqlrc to add \set AUTOCOMMIT off.
  • Not all data-loss situations can be fixed with replication alone; Point-in-Time Recovery is absolutely still relevant these days.
  • It helps to have a PostgreSQL expert on-hand day-or-night. End Point offers 24x7 PostgreSQL support, which you can engage by getting ahold of us here.

published by noreply@blogger.com (Jeff Boes) on 2016-01-15 14:30:00 in the "bash" category

Recently I managed to break the bash shell in an interesting and puzzling way. The initial symptoms were very frustrating: a workflow process we use here (creating a development camp) failed for me, but for no one else. That was at least a clue that it was me, not the workflow process.

Eventually, I narrowed down the culprit to the "grep" command (and that was more through luck than steadfast Sherlock-like detective work).

$ grep foo bar

grep: foo: No such file or directory

Eh? grep is mis-parsing the arguments! How does that happen?

So I began to study my bash environment. Eventually I came up with this fascinating little typo:

export GREP_OPTIONS='–color=auto'

That's supposed to be:

export GREP_OPTIONS='--color=auto'

but it got recorded in my .bashrc as an en-dash, not a double-dash. (My guess is that I cut-and-pasted this from a web page where someone over-helpfully "typeset" this command.)

Ironically, this typo is innocuous under Bash 3.x, but when you slot it into a Bash 4.x installation, all heck busts loose.
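
One quick way to catch this sort of invisible typo is to look for non-ASCII bytes in your shell startup files. With GNU grep, something like this works:

$ grep -nP '[^\x00-\x7F]' ~/.bashrc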


published by Eugenia on 2016-01-15 02:19:07 in the "General" category
Eugenia Loli-Queru

The No. 1 point for health is sunshine, not food. You can eat all the shit in the world, but if you get a lot of UV, you can still be healthy as a horse, up to a certain age. For those of us who don't, we need to be more selective. Here's a rundown of my Pegan diet: a diet resembling raw vegan in many ways, but also a Paleo version with emphasis on some wild fish rather than land meat.

– Regarding fats: it's a medium to low fat diet. You can't have it both ways (both high fat and high carb) without compromising health. I chose moderate-to-high carbs over moderate-to-low fat.
– If you live near the equator, get more vegan. If you live close to the Arctic, eat more fats, fish and less carbs. For us in the middle geographically, ratios are somewhere in the middle too.
– Beans and nuts/seeds are soaked before consumed. Brown rice is sprouted, otherwise white rice is used (once a week). Dairy must be fermented and preferably raw. Beans are eaten twice a week.
– Every breakfast is a smoothie of fruits, greens, kefir, and super-food powders.
– Every lunch (except Sunday's) is a raw salad with a raw dressing. Monday's salad also includes a boiled, pastured egg.
– Dinners are mentioned below. Dinners also include some salad, cheese, fruit, and additional veggies (depending on the recipe).
– There's wild, low-mercury seafood 3 times a week, and pastured bone broth in soups, occasionally. Offal once a month, if labs show inadequacy in B12.
– Vitamins (unfortunately, our soils are depleted, so some supplementation is required). To be taken with a fatty meal: K2-Mk4 once a week, CoQ10 Ubiquinol twice a week, Magnesium thrice a week before sleep, Methyl-based B-complex once a week, D3 5000 IU twice a week in winter time only. DHA too, if not enough fatty fish is consumed.
– 75% of the whole diet is raw. Over 90% of it is vegan.

MONDAY (dinner)
beans (mostly lentils)

TUESDAY (dinner)
wild fish (mostly sardines) + veggies

WEDNESDAY (dinner)
Baked white potatoes + cheese + 2 boiled eggs

THURSDAY (dinner)
beans (any kind)

FRIDAY (dinner)
shellfish (mostly canned oysters) + sweet potatoes + veggies

SATURDAY
veggies + 2 eggs (lunch)
rice + Indian veggie curry (dinner)

SUNDAY
Fish: Alaskan wild salmon + veggies (lunch)
white potatoes + veggies (dinner)


published by noreply@blogger.com (Ben Witten) on 2016-01-13 21:28:00 in the "analytics" category

Google Analytics, a web analytics service offered by Google, is a very handy tool for understanding your audience. It allows you to understand where traffic comes from and what resonates with your audience, which has led to Google Analytics being the most widely used web analytics service on the internet. If you understand your website's traffic, you then have the ability to focus your website and content to optimize engagement and growth.

With Google Analytics, you have the ability to see traffic from all channels. This will lead to clear insights, and will help you understand what's working and what's not.
  • Organic - traffic from search engines which is not paid for
  • Paid Search - visitors that clicked on one of your paid advertisements (also known as Pay-Per-Click or PPC)
  • Direct - visitors that typed your website address directly into the browser (includes bookmarks)
  • Social - traffic from sites that are considered to be social networking sites
  • Referral - visitors that arrived from 3rd party referrals
  • Email - visitors that are directed from an email
  • Display - visitors directed from video and display advertising

It will be helpful to walk through an example. Say you launch an email marketing campaign, and want to understand how your audience responded to your content. First, you can check how many people clicked the ad to come to your website. From there, you can see how they navigated the page. Who came to the webpage? Did they take the actions you were hoping they would take? How long did they spend viewing the page? Where did they click? Did this click lead to a conversion/sale?

Prior to coming to End Point, I was working as an Analytic Strategist at a digital media agency. One of my biggest clients was luxury jewelry company Tiffany & Co. My responsibilities included analyzing seasonal and promotional trends to develop forecasts, broken out by channel. I would also evaluate the marketing effectiveness of initiatives by keeping a close eye on the user experience and navigational behavior, and provide recommendations throughout the customer journey.

Many jewelry promotions are seasonal, so we were constantly making changes to optimize the page and ensure traffic was navigating the website as we hoped. I would make sure to keep in mind the 4 P's: Product, Price, Place, and Promotion. I wanted to ensure that the layout of the site, price point, and specials were always in line with what the audience was most looking for during the season.

This experience of understanding client goals, and helping them achieve those goals through the success of their websites, has proven to be a valuable skill since joining End Point's team. Beyond having the ability to help clients with their websites, this knowledge has also been helpful internally. I keep a close eye on where our website traffic is coming from, including seeing which blog posts resonate with different audiences. I have made recommendations for adjustments to poor-performing pages, and taken time to analyze pages with high bounce rates, to see whether the bounce rates are due to the page not being well-constructed or due to traffic not being properly directed. I also take note of any industry changes, and make website adjustments to accommodate those changes. For example, End Point has a plethora of skill in many programming languages, and if we see one language gaining popularity we will make a point to highlight it more prominently.

I look forward to continuing to use Google Analytics to strengthen our company websites and to help our clients strengthen theirs. Please don't hesitate to reach out at ask@endpoint.com if you would like to learn more.


published by noreply@blogger.com (Greg Sabino Mullane) on 2016-01-13 15:24:00 in the "mediawiki" category

Keeping your MediaWiki site up to date with the latest version is, like many sysadmin tasks, a never-ending chore. In a previous article I covered how to upgrade minor revisions of MediaWiki with patches. In this one, I'll cover my solution to doing a "major" upgrade to MediaWiki. While the official upgrade instructions are good, they don't cover everything.

MediaWiki, like Postgres, uses a three-section version number in which the first two numbers combined give the major version, and the number on the end the revision of that branch. Thus, version 1.26.2 is the third revision (0, then 1, then 2) of the 1.26 version of MediaWiki. Moving from one major version to another (for example 1.25 to 1.26) is a larger undertaking than updating the revision, as it involves significant software changes, whereas a minor update (in which only the revision changes) simply provides bug fixes.

The first step to a major MediaWiki upgrade is to try it on a cloned, test version of your wiki. See this article on how to make such a clone. Then run through the steps below to find any problems that may crop up. When done, run through again, but this time on the actual live site. For this article, we will use MediaWiki installed in ~intranet/htdocs/mediawiki, and going from version 1.25.3 to 1.26.2

Preparation

Before making any changes, make sure everything is up to date in git. You do have your MediaWiki site controlled by git, right? If not, go do so right now. Then check you are on the main branch and have no outstanding changes. It should look like this:

$ cd ~/htdocs/mediawiki
$ git status
# On branch master
nothing to commit, working directory clean

Download

Time to grab the new major version. Always get the latest revision in the current branch. For this example, we want the highest in the 1.26 branch, which as of this writing is 1.26.2. You can always find a prominent link on mediawiki.org. Make sure you grab both the tarball (tar.gz) and the signature (.tar.gz.sig) file, then use gnupg to verify it:

$ wget https://releases.wikimedia.org/mediawiki/1.26/mediawiki-1.26.2.tar.gz
$ wget https://releases.wikimedia.org/mediawiki/1.26/mediawiki-1.26.2.tar.gz.sig
$ gpg mediawiki-1.26.2.tar.gz.sig 
gpg: assuming signed data in `mediawiki-1.26.2.tar.gz'
gpg: Signature made Sun 20 Dec 2015 08:13:14 PM EST using RSA key ID 23107F8A
gpg: please do a --check-trustdb
gpg: Good signature from "Chad Horohoe <chad@wikimedia.org>"
gpg:                 aka "keybase.io/demon <demon@keybase.io>"
gpg:                 aka "Chad Horohoe (Personal e-mail) <innocentkiller@gmail.com>"
gpg:                 aka "Chad Horohoe (Alias for existing email) <chadh@wikimedia.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 41B2 ABE8 17AD D3E5 2BDA  946F 72BC 1C5D 2310 7F8A

Copy the tarball to your server, and untar it in the same base directory as your mediawiki installation:

$ cd ~/htdocs
$ tar xvfz ~/mediawiki-1.26.2.tar.gz

Copy files

Copy the LocalSettings.php file over, as well as any custom images (e.g. the logo, which I like to keep nice and visible at the top level):

$ cp mediawiki/LocalSettings.php mediawiki-1.26.2/
$ cp mediawiki/Wiki_logo.png mediawiki-1.26.2/

Set up the images directory. The tarball comes with a dummy directory containing a few unimportant files. We want to replace that with our existing one. I keep the images directory a level up from the actual mediawiki directory, and symlink it in. This allows for easy testing and upgrades:

$ cd ~/htdocs/mediawiki-1.26.2
$ rm -fr images/ ## Careful, make sure you are in the right directory! :)
$ ln -s ../images/ .

Copy extensions

Now it is time to copy over the extensions. MediaWiki bundles a number of extensions in the tarball, as they are considered "core" extensions. We do not want to overwrite these with our old versions. We do want to copy any extensions that exist in our old mediawiki directory, yet not in our newly created one. To help keep things straight and reduce typing, let's make some symlinks for the existing (old) MediaWiki and for the current (new) MediaWiki, naming them "aa" and "bb" respectively. Then we use "diff" to help us copy the right extensions over:

$ cd ~/htdocs
$ ln -s mediawiki aa
$ ln -s mediawiki-1.26.2 bb
## Visually check things over with:
$ diff aa/extensions bb/extensions | grep 'Only in aa' | awk '{print $4}' | more
## Do the copying:
$ diff aa/extensions bb/extensions | grep 'Only in aa' | awk '{print $4}' | xargs -iZ cp -r aa/extensions/Z bb/extensions/Z

Extensions may not be the only way you have modified your installation. There could be skins, custom scripts, etc. Copy these over now, being sure to only copy what is truly still needed. Here's one way to check on the differences:

$ cd ~/htdocs
$ diff -r aa bb | grep 'Only in aa' | more

Check into git

Now that everything is copied over, we can check the 1.26.2 changes into git. To do so, we will move the git directory from the old directory to the new one. Remember to let anyone who might be developing in that directory know what you are doing first!

$ mv aa/.git bb/
## Don't forget this important file:
$ mv aa/.gitignore bb/
$ cd mediawiki-1.26.2
$ git add .
$ git commit -a -m "Upgrade to version 1.26.2"
$ git status
# On branch master
nothing to commit, working directory clean

Extension modifications

This is a good time to make any extension changes that are needed for the new version. These should have been revealed in the first round, using the cloned test wiki. In our case, we needed an updated and locally hacked version of the Auth_remoteuser extension:

$ cd ~/htdocs/mediawiki-1.26.2/extensions
$ rm -fr Auth_remoteuser/
$ tar xvfz ~/Auth_remoteuser.tgz
$ git add Auth_remoteuser
$ git commit -a -m "New version of Auth_remoteuser extension, with custom fix for wpPassword problem"

Core modifications

One of the trickiest parts of major upgrades is the fact that all the files are simply replaced. Normally not a problem, but what if you are in the habit of modifying the core files because sometimes extensions cannot do what you want? My solution is to tag the changes prominently - using a PHP comment that contains the string "END POINT". This makes it easy to generate a list of files that may need the local changes applied again. After using "git log" to find the commit ID of the 1.26.2 changes (message was "Upgrade to version 1.26.2"), we can grep for the unique string and figure out which files to examine:

$ git log 1a83a996b9d00444302683fb6de6e86c4f4006e7 -1 -p | grep -E 'diff|END POINT' | grep -B1 END
diff --git a/includes/mail/EmailNotification.php b/includes/mail/EmailNotification.php
-        // END POINT CHANGE: ignore the watchlist timestamp when sending notifications
-        // END POINT CHANGE: send diffs in the emails
diff --git a/includes/search/SearchEngine.php b/includes/search/SearchEngine.php
-       // END POINT CHANGE: Remove common domain suffixes

At that point, manually edit both the new and old version of the files and make the needed changes. After that, remember to commit all your changes into git.
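For example, one way to eyeball what needs to be re-applied is to diff the old and new copies of each flagged file, using the aa and bb symlinks created earlier (a quick sketch; the paths are simply the ones flagged by the grep above):

$ cd ~/htdocs
## New stock version on the left, old locally-modified version on the right:
$ diff -u bb/includes/mail/EmailNotification.php aa/includes/mail/EmailNotification.php | less
$ diff -u bb/includes/search/SearchEngine.php aa/includes/search/SearchEngine.php | less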

Final changes

Time to make the final change, and move the live site over. The goal is to minimize the downtime, so we will move the directories around and run the update.php script on one line. This is an excellent time to notify anyone who may be using the wiki that there may be a few bumps.

## Inform people the upgrade is coming, then:
$ mv mediawiki old_mediawiki; mv mediawiki-1.26.2 mediawiki; cd mediawiki; php maintenance/update.php --quick
$ rm ~/htdocs/aa ~/htdocs/bb

Testing

Hopefully everything works! Time to do some testing. First, visit your wiki's Special:Version page and make sure it says 1.26.2 (or whatever version you just installed); a quick command-line version check is also sketched just after the list below. Next, test that most things are still working by:

  • Logging in, and...
  • Editing a page, then...
  • Uploading an image, plus...
  • Testing all your extensions.
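As a command-line spot check of the version, you can also ask the MediaWiki API (a quick sketch; adjust the URL to wherever your wiki's api.php actually lives):

$ curl -s 'https://wiki.example.com/api.php?action=query&meta=siteinfo&siprop=general&format=json' | grep -o '"generator":"[^"]*"'
## Should print something like: "generator":"MediaWiki 1.26.2"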

For that last bullet, having an extension testing page is very handy. This is simply an unused page on the wiki that tries to utilize as many active extensions as possible, so that reloading the page should quickly allow a tally of working and non-working extensions. I like to give each extension a header with its name, a text description of what should be seen, and then the actual extension in action.

That's the end of the major upgrade for MediaWiki! Hopefully in the future the upgrade process will be better designed (I have ideas on that - but that's the topic of another article). One final check you can do is to open a screen session and tail -f the httpd error log for your site. After the upgrade, this is a helpful way to spot any issues as they come up.
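A minimal sketch of that last tip (the log path is an assumption; use whatever ErrorLog your Apache virtual host actually writes to):

$ screen -S wiki-logs
$ tail -f /var/log/httpd/error_log
## Detach with Ctrl-a d, and reattach later with:
$ screen -r wiki-logs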


published by noreply@blogger.com (Marina Lohova) on 2016-01-12 17:44:00 in the "apache" category

I wanted to install WordPress locally for my blog (about programming!), but using MAMP, XAMPP, or even Vagrant for this seemed like overkill. I wanted a light setup. PHP and Apache are already integrated into Mac OS X, so why not use them? I also wanted to deploy the app to Heroku, which added another wrinkle, since Heroku only provides PostgreSQL, not MySQL, out of the box. I'd like to share my research on how I did it.

WordPress with Heroku support

I found this handy WordPress template with built-in Heroku support. It has everything one needs to run WordPress on Heroku: PostgreSQL for WordPress (because MySQL on Heroku is a paid service), Amazon S3 and CloudFront for your uploads (since Heroku has an ephemeral file system), WP SendGrid to send emails, and WordPress HTTPS. Check out a copy with this command:

git clone git://github.com/mhoofman/wordpress-heroku.git

Let's run the project locally first: files written to Heroku's file system do not persist, so updating and installing plugins or themes should be done locally and then pushed to Heroku. I'm using PhpStorm for my PHP development.

Configuring Apache

mkdir -p ~/Sites
echo "<html><body><h1>My site works</h1></body></html>" > ~/Sites/index.html.en

Enable PHP support:

sudo vi /etc/apache2/httpd.conf

Uncomment the following lines to look like this:

LoadModule php5_module libexec/apache2/libphp5.so
LoadModule userdir_module libexec/apache2/mod_userdir.so
Include /private/etc/apache2/extra/httpd-userdir.conf

Save and exit. Open the following file:

sudo vi /etc/apache2/extra/httpd-userdir.conf

Uncomment the following line to look like this:

Include /private/etc/apache2/users/*.conf

Save and exit. Open or create:

sudo vi /etc/apache2/users/YOURUSERNAME.conf

Type the following in there:

<Directory "/Users/YOURUSERNAME/Sites/">
    AddLanguage en .en
    LanguagePriority en fr de
    ForceLanguagePriority Fallback
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from localhost
    Require all granted
</Directory>

Restart Apache with:

sudo apachectl restart
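If the restart fails or the site misbehaves afterwards, a quick syntax check of the configuration usually pinpoints the problem:

sudo apachectl configtest
## Prints "Syntax OK" when the configuration parses cleanly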

Go to http://localhost/~YOURUSER/wordpress-heroku/ and enjoy the results of your work! OK, not so fast! There are more steps to make it happen ;)

Enabling PostgreSQL for PHP

At this point you may see an error like this:

Your PHP installation appears to be missing the PostgreSQL extension which is required by WordPress with PG4WP.

Here is a handy script to fix this problem: Install PHP PGSQL extensions on Mac OS X Yosemite (change PHP_VER to your PHP version).
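You can confirm whether the extension is actually available before and after running that script; php -m lists the modules your command-line PHP knows about (note the CLI list can differ from what Apache's mod_php loads):

php -m | grep -i pgsql
## No output means the PostgreSQL extension is still missing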

Creating the database

Hit http://localhost/~YOURUSER/blog-heroku/wp-admin/install.php and you will most likely be greeted with:

Error establishing a database connection

The template we are using is tailored for deployment to Heroku: wp-config.php reads its connection settings from the DATABASE_URL environment variable, which on Heroku points to the Heroku-hosted database. Locally that variable is not set, so we need to create the PostgreSQL database that the fallback connection string expects:

$ pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start
$ createdb wordpress
$ psql wordpress
wordpress=# CREATE USER wordpress WITH PASSWORD 'wordpress';
wordpress=# GRANT ALL PRIVILEGES ON DATABASE wordpress TO wordpress;
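To verify that the new role and database match the connection string WordPress will fall back to, you can connect with the same URI (a quick check; psql accepts a connection URI directly):

$ psql "postgres://wordpress:wordpress@localhost:5432/wordpress" -c "SELECT current_user, current_database();"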

In wp-config.php, edit the connection fallback as follows, making sure it matches the database and user that you just created:

$db = parse_url($_ENV["DATABASE_URL"] ? $_ENV["DATABASE_URL"] : "postgres://wordpress:wordpress@localhost:5432/wordpress");

Now 5 hours later, you are completely ready for the famous 5-min install ;D. Go to http://localhost/~YOURUSER/blog-heroku/wp-admin/install.php

Uploading the custom theme/plugin

What to do next? Of course, upload a custom theme or plugin.

When uploading, you may run into this error (the permissions fix shown below takes care of it):

Unable to create directory wp-content/uploads/2015/08. Is its parent directory writable by the server?

If you encounter an error asking you for FTP credentials in order to do this:

To perform the requested action, WordPress needs to access your web server.
Please enter your FTP credentials to proceed.

If you do not remember your credentials, you should contact your web host.

The problem is that the Apache HTTP Server in Mac OS X runs under the user account _www, which belongs to the group _www. To allow WordPress to write files through Apache, change the owner of the wordpress directory and its contents to _www, keep the group as staff (a group to which your own user account belongs), and give write permission to the group:

$ cd ~/Sites/THESITE
$ sudo chown -R _www wordpress
$ sudo chmod -R g+w wordpress

This way, no file or directory is world-writable.

Remember to commit your plugins and themes: because of Heroku's ephemeral file system, any files not committed to git (or stored in the database) are wiped out on each server restart, so changes made directly on the server will be lost.

I installed this pretty theme for myself called Literatum -- just bragging.

Deployment to Heroku

Now for the last and most exciting step, the one that makes your blog visible to the world! Commit the changes:

rm -rf .git
git init
git add .
git commit -m "Initial commit"

Create Heroku app:

$ cd wordpress-heroku
$ heroku create
$ heroku addons:create heroku-postgresql
$ heroku pg:promote HEROKU_POSTGRESQL_INSTANCE
$ heroku addons:create sendgrid:starter

Your first deployment!

git push heroku master

Go to http://YOURAPP.herokuapp.com/wp-admin/install.php and run the famous 5-minute setup again, activate all the plugins and the chosen custom theme aaand... You are done!

Hope you will find this write-up useful and it will help you create your blog on the web!


published by noreply@blogger.com (Peter Hankiewicz) on 2016-01-12 00:30:00 in the "e-commerce" category

Introduction

Recently I was working with Magento 1.9.1, trying to get a list of dropdown attribute options sorted by the position property. However, there is a known bug in Magento 1.9.1 where the position property is not respected.

I looked for a patch to fix this issue, but there was no official patch, and none of the available community fixes were good enough. So once again, I needed to fix it myself.

Tip! If you already know how to apply a patch file, it is here. If not, please continue.
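In case it helps, applying such a patch from the Magento root usually boils down to something like this (a sketch; the patch file name is hypothetical, and the -p level depends on how the paths inside the patch were generated):

$ cd /path/to/magento
$ patch -p0 --dry-run < attribute-position-fix.patch   ## preview only
$ patch -p0 < attribute-position-fix.patch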

Part 1

Unfortunately, we need to override some Magento core code. The good thing is that Magento provides a clean way to do this: instead of overwriting the core files directly, we create a local copy that takes precedence.

Copy the app/code/core/Mage/Catalog/Model/Resource/Product/Type/Configurable/Attribute/Collection.php file to app/code/local/Mage/Catalog/Model/Resource/Product/Type/Configurable/Attribute/Collection.php. You need to create the whole directory structure first. On a Unix-like system it is as simple as this (run from the Magento root):

mkdir -p app/code/local/Mage/Catalog/Model/Resource/Product/Type/Configurable/Attribute/
cp app/code/core/Mage/Catalog/Model/Resource/Product/Type/Configurable/Attribute/Collection.php app/code/local/Mage/Catalog/Model/Resource/Product/Type/Configurable/Attribute/Collection.php

Part 2

Fill the local copy with the content from this source: https://gist.github.com/peter-hank/c917394ea9f1171ddeb8

Notes

After these changes it will work as expected, and attribute options will be sorted by the position set on each option.

This fix should work for any Magento 1.9.* version.


published by noreply@blogger.com (Jon Jensen) on 2016-01-08 01:26:00 in the "sysadmin" category

Most Unix filesystems, including all the common Linux ones, are fully case-sensitive, meaning you can have two files in the same directory that differ only by case:

  • a-very-nice-image.png
  • a-VERY-nice-image.png

However, this is not true on Windows and Mac OS X. They will preserve your chosen capitalization, but each file name must be unique regardless of the case.

I don't know of situations where it would be wise to have such conflicting mixed-case files even on Linux where it works fine. But for various reasons this can happen in the messy real world. If you then send those files to someone on Windows or Mac OS X in a zip file, or via Git version control, they're going to be confused.

When unzipping, usually the last file to be extracted will overwrite the earlier one with the nearly-same name. So a file that is perhaps important will just be mysteriously gone.

When pulling in files with Git, the same thing happens, but you also immediately have an unclean working copy that Git will tell you about:

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   a-VERY-nice-image.png

To fix that, it's easiest to go back to the (presumably) Linux system where the conflicting files were created, and remove one of them. Then on the Windows or Mac OS X side, remove both files (with normal rm) and then git pull or at least git checkout . to write the surviving file to disk.
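Using the example file names from above, the cleanup might look something like this (a sketch, assuming the all-caps variant is the one you want to drop):

## On the Linux system where both files exist:
$ git rm a-VERY-nice-image.png
$ git commit -m "Remove case-conflicting duplicate"
$ git push

## On the Windows or Mac OS X clone, after the fix has been pushed:
$ rm a-very-nice-image.png a-VERY-nice-image.png
$ git pull
$ git checkout -- .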

If you have a large set of files with several conflicting names scattered throughout and want an easy way to see them all at once so you can clean them up, there's a simple Unix pipeline you can run at the shell:

find . | sort -f | uniq -Di

That is a variation of one of the suggestions from a Stack Exchange discussion, changed to show both file names with the uniq -D option that exists only in the GNU version of uniq. Other versions have uniq -d, which is almost as good but shows only one name from each case-duplicated pair.
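On systems without GNU uniq (including stock Mac OS X), the nearly-as-good variant mentioned above would be:

find . | sort -f | uniq -di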

Happy new year, and I hope you never have need for any of this advice. :)


published by noreply@blogger.com (Ben Witten) on 2016-01-04 16:35:00 in the "Liquid Galaxy" category

On December 15th, End Point presented the Liquid Galaxy at New York Tech Meetup, the largest meetup group in the world. End Point Co-Founder/President Ben Goldstein and I gave a five minute overview and then answered questions from the audience for another five minutes. At the conclusion of the presentation portion of the event, attendees went to the after-party, where we had a full Liquid Galaxy system set up for attendees to experience for themselves.

I opened up the presentation by speaking about where the Liquid Galaxy is being utilized (corporate offices, museums, and commercial enterprises around the world), and about the setup of the system (large paneled HDTVs, a touchscreen, 3D joystick, and 3 rack-mount servers). I talked about how the Liquid Galaxy was originally a tool designed to view Google Earth in an immersive setting, but from there End Point has expanded the Liquid Galaxy's capabilities to make it a tool for educational and commercial use, marketing, sales, and research.

I went on to explain how End Point's Content Management System gives users the ability to tell their story and show information and data points on the system. You can easily include images, panoramic videos, KML and other sorts of overlays. By using your company's GIS (Geographic Information System) data, you have an entirely new way of visualizing data and 3D models, and of presenting demographic and other critical information to your target audience, while still providing the excitement of the immersive experience. This includes things like population density, property value data, bar graphs and more.

I closed by explaining how the view on each screen is calculated to match the physical angle of its display. When those displays are put in an arch around the viewers, the sense of immersion and depth comes alive to the point of giving a real sense of flight and place.

From here, Ben G. touched on some of the more technical elements of the Liquid Galaxy. He explained that the main applications running on the Liquid Galaxy are Google Earth, Street View, a local panoramic still image and video viewer, and End Point's Content Management System.

Ben G. discussed remote support, and how when End Point deploys a new system, it monitors the system in a variety of ways, one of which is with screen captures. Screenshots are generated so End Point can see remotely what its customers are seeing on the ground.

Ben G. finished the talk by explaining the architecture of the typical Liquid Galaxy systems. The physical architecture has a headnode and a number of display nodes. The head node is plugged into the network and it serves as a NATing router for the display nodes which run on an internal LAN. The display nodes are blank machines that netboot from an ISO on the headnode. This makes it easy to remotely upgrade a system. End Point support personnel generate a new ISO on the headnode and restart the display nodes remotely.

Q&A Session

After our presentation, Ben G. and I were asked questions about the Liquid Galaxy. Some highlights:
  • "How does the Liquid Galaxy fit in the context of AR and VR?" Ben G. explained that one thing that distinguishes the Liquid Galaxy is the communal experience. Often with VR, the user is having an individual experience. We foresee combining with the LG with a VR headset, so that you will be able to display what you?re seeing on the headset right onto the LG.
  • "How is End Point engaging content creators for the platform?" There is a big, vibrant community that we want to integrate with. For a simple but useful example, cell phones can generate panoramas, which can be dropped into the CMS.
  • "Can you use the Liquid Galaxy to map an emergency before and after the event?" Ben G: Absolutely. We think the LG is an excellent display platform to show events as they are occurring. One of the things that?s great is the Liquid Galaxy takes in your peripheral vision, so if you have a spherical or panoramic collection tool then you can really see it on the system.
  • "How close to live data can you get?" The Liquid Galaxy is primarily being run with Google Earth at the moment. You can incorporate panoramic images, depending on what your data source is. If you?d like to incorporate a panoramic webcam, your info can display instantaneously on the LG.

During the after-party, attendees had the opportunity to free fly on the Liquid Galaxy, to view presentations, and to navigate around on Street View. Six members of End Point's team were present, so attendees also had the opportunity to ask questions and to learn more about the system.