
published by noreply@blogger.com (Greg Sabino Mullane) on 2014-12-22 19:07:00 in the "Bucardo" category
Bucardo is one of the trigger-based replication systems for Postgres (others include Slony and Londiste). All of these not only use triggers to gather information on what has changed, but they also disable triggers when copying things to remote databases. They do this to ensure that only the data itself gets copied, in as fast a manner as possible. This also has the effect of disabling foreign keys, which Postgres implements by use of triggers on the underlying tables. There are times, however, when you need a trigger on a target to fire (such as for data masking). Here are four approaches to working around the disabling of triggers. The first two solutions will work with any replication system, but the third and fourth are specific to Bucardo.

First, let's understand how the triggers get disabled. A long time ago (Postgres 8.2 and older), triggers had to be disabled by direct changes to the system catalogs. Luckily, those days are over, and now this is done by issuing this command before copying any data:

SET session_replication_role = 'replica';

This prevents all normal triggers and rules from being activated. There are times, however, when you want certain triggers (or their effects) to execute during replication.

Let's use a simple hypothetical to illustrate all of these solutions. We will start with the Postgres built-in pgbench utility. Its initialize option (-i) can be used to create and populate some tables:

$ createdb btest1
$ pgbench -i btest1
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.16 s, remaining 0.00 s).
vacuum...
set primary keys...
done.

We want to replicate all four of the tables pgbench just created. Bucardo requires that a table have a primary key or a unique index to be replicated, so we will need to make an immediate adjustment to the pgbench_history table:

$ psql btest1 -c 'ALTER TABLE pgbench_history ADD hid SERIAL PRIMARY KEY'
ALTER TABLE

Now to make things a little more interesting. Let's add a new column to the pgbench_accounts table named "phone", which will hold the account owner's phone number. As this is confidential information, we do not want it to be available - except on the source database! For this example, database btest1 will be the source, and database btest2 will be the target.

$ psql btest1 -c 'ALTER TABLE pgbench_accounts ADD phone TEXT'
ALTER TABLE
$ createdb btest2 --template=btest1

To prevent the phone number from being revealed to anyone querying btest2, a trigger and supporting function are used to change the phone number to always display the word 'private'. Here is what they look like.

btest2=# CREATE OR REPLACE FUNCTION elide_phone()
  RETURNS TRIGGER
  LANGUAGE plpgsql
  AS $bc$
BEGIN
  NEW.phone = 'private';
  RETURN NEW;
END;
$bc$;
CREATE FUNCTION

btest2=# CREATE TRIGGER elide_phone
  BEFORE INSERT OR UPDATE
  ON pgbench_accounts
  FOR EACH ROW
  EXECUTE PROCEDURE elide_phone();
CREATE TRIGGER

Now that everything is set up, we can install Bucardo and teach it how to replicate those tables:

$ bucardo install 
This will install the bucardo database into ...
...
Installation is now complete.

$ bucardo add db A,B dbname=btest1,btest2
Added databases "A","B"

$ bucardo add sync pgb dbs=A,B tables=all
Added sync "pgb"
Created a new relgroup named "pgb"
Created a new dbgroup named "pgb"
  Added table "public.pgbench_accounts"
  Added table "public.pgbench_branches"
  Added table "public.pgbench_history"
  Added table "public.pgbench_tellers"

$ bucardo start

A demonstration of the new trigger is now in order. On the database btest2, we will update a few rows and attempt to set the phone number. However, our new trigger will overwrite our changes:

$ psql btest2 -c "update pgbench_accounts set abalance=123, phone='867-5309' where aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone  
-----+----------+---------
   1 |      123 | private
   2 |      123 | private
   3 |      123 | private

So, all is as we expected: any changes made to this table have the phone number changed. Let's see what happens when the changes are done via Bucardo replication. Note that we are updating btest1 but querying btest2:

$ psql btest1 -c "update pgbench_accounts set abalance=99, phone='867-5309' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       99 | 867-5309
   2 |       99 | 867-5309
   3 |       99 | 867-5309

As you can see, our privacy safeguard is gone, as Bucardo disables the trigger on btest2 before making the changes. So what can we do? There are four solutions: set the trigger as ALWAYS, set the trigger as REPLICA, use Bucardo's customcode feature, or use Bucardo's customcols feature.

Solution one: ALWAYS trigger

The easiest way is to simply mark the trigger as ALWAYS, which means that it will always fire, regardless of what session_replication_role is set to. This is the best solution for most problems of this sort. Changing the trigger requires an ALTER TABLE command. Once done, psql will show you the new state of the trigger as well:

btest2=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers 
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       | 
 abalance | integer       | 
 filler   | character(84) | 
 phone    | text          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Triggers:
    elide_phone BEFORE INSERT OR UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE elide_phone()

btest2=# ALTER TABLE pgbench_accounts ENABLE ALWAYS TRIGGER elide_phone;
ALTER TABLE

btest2=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers 
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       | 
 abalance | integer       | 
 filler   | character(84) | 
 phone    | text          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Triggers firing always:
    elide_phone BEFORE INSERT OR UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE elide_phone()

That is some ugly syntax for changing the triggers, eh? (To restore a trigger to its default state, you would simply leave out the ALWAYS clause, so it becomes ALTER TABLE pgbench_accounts ENABLE TRIGGER elide_phone). Time to verify that the ALWAYS trigger fires even when Bucardo is updating the table:

$ psql btest1 -c "update pgbench_accounts set abalance=11, phone='555-2368' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       11 | private
   2 |       11 | private
   3 |       11 | private

Solution two: REPLICA trigger

Trigger-based replication solutions, you may recall from above, issue this command: SET session_replication_role = 'replica'. What this means is that all rules and triggers that are not of type replica are skipped (with the exception of ALWAYS triggers, of course). Thus, another solution is to set the triggers you want to fire to be of type "replica". Once you do this, however, the triggers will NOT fire in ordinary use - so be careful. Let's see it in action:

btest2=# ALTER TABLE pgbench_accounts ENABLE REPLICA TRIGGER elide_phone;
ALTER TABLE

btest2=# \d pgbench_accounts
   Table "public.pgbench_accounts"
  Column  |     Type      | Modifiers 
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       | 
 abalance | integer       | 
 filler   | character(84) | 
 phone    | text          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Triggers firing on replica only:
    elide_phone BEFORE INSERT OR UPDATE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE elide_phone()

As before, we can test it out and verify the trigger is firing:

$ psql btest1 -c "update pgbench_accounts set abalance=22, phone='664-7665' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       22 | private
   2 |       22 | private
   3 |       22 | private

Solution three: Bucardo customcode

Bucardo supports a number of hooks into the replication process. These are called "customcodes" and consist of Perl code that is invoked by Bucardo. To solve the problem at hand, we will create some code for the "code_before_trigger_enable" hook - in other words, right after the actual data copying is performed. To create the customcode, we write the actual code to a text file, then do this:

$ bucardo add code nophone whenrun=before_trigger_enable sync=pgb src_code=./nophone.pl

This creates a new customcode named "nophone" that contains the code inside the local file "nophone.pl". It runs after the replication, but before the triggers are re-enabled. It is associated with the sync named "pgb". The content of the file looks like this:

my $info = shift;

return if ! exists $info->{rows};

my $schema = 'public';
my $table = 'pgbench_accounts';
my $rows = $info->{rows};
if (exists $rows->{$schema} and exists $rows->{$schema}{$table}) {
  my $dbh = $info->{dbh}{B};
  my $SQL = "UPDATE $schema.$table SET phone=? "
    . "WHERE aid = ? AND phone <> ?";
  my $sth = $dbh->prepare($SQL);
  my $string = 'private';
  for my $pk (keys %{ $rows->{$schema}{$table} }) {
    $sth->execute($string, $pk, $string);
  }
}
return;

Every customcode is passed a hashref of information from Bucardo. One of the things passed in is a list of changed rows. At the top, we see that we exit right away (via return, as the customcodes become Perl subroutines) if there are no rows this round. Then we check that something has changed for the pgbench_accounts table. We grab the database handle, also passed to the subroutine. Note that this is actually a DBIx::Safe handle, not a direct DBI handle. The difference is that certain operations, such as commit, are not allowed.

Once we have the handle, we walk through all the rows that have changed, and set the phone to something safe. The above code is a good approach, but we can make the UPDATE much smarter because we are using a modern Postgres which supports ANY, and a modern DBD::Pg that supports passing Perl arrays in and out. Once we combine those two, we can move the execute() out of the loop into a single call like so:

...
if (exists $rows->{$schema} and exists $rows->{$schema}{$table}) {
  my $dbh = $info->{dbh}{B};
  my $SQL = "UPDATE $schema.$table SET phone=? "
    . "WHERE aid = ANY(?) AND phone <> ?";
  my $sth = $dbh->prepare($SQL);
  my $string = 'private';
  $sth->execute($string, [ keys %{ $rows->{$schema}{$table} } ], $string);
}

Note that this solution requires Bucardo version 5.3.0 or better. Let's verify it:

$ psql btest1 -c "update pgbench_accounts set abalance=33, phone='588-2300' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       33 | private
   2 |       33 | private
   3 |       33 | private

Solution four: Bucardo customcols

The final way to keep the information in that column masked is to use Bucardo's 'customcols' feature. This allows rewriting of the command that grabs rows from the source databases. Bucardo uses COPY to grab rows from a source, DELETE to remove the rows if they exist on the target, and another COPY to add the rows to the target tables. Postgres supports adding a SELECT clause to a COPY command, as we will see below. To hide the values of the phone column using the customcols feature, we simply do:

$ bucardo add customcols public.pgbench_accounts "select aid,bid,abalance,filler,'private' as phone" db=B sync=pgb
New columns for public.pgbench_accounts: "select aid,bid,abalance,filler,'private' as phone" (for database B) (for sync pgb)

The list of columns must be the same as in the original table, but we can modify things! So rather than Bucardo doing this:

COPY (SELECT * FROM public.pgbench_accounts WHERE aid IN (1,2,3)) TO STDOUT

Bucardo will instead do this thanks to our customcols:

COPY (SELECT aid,bid,abalance,filler,'private' as phone FROM public.pgbench_accounts WHERE aid IN (1,2,3)) TO STDOUT

Let's verify it:

$ psql btest1 -c "update pgbench_accounts set abalance=44, phone='736-5000' WHERE aid <= 3"
UPDATE 3

$ psql btest2 -c 'select aid,abalance,phone from pgbench_accounts order by aid limit 3'
 aid | abalance |  phone   
-----+----------+----------
   1 |       44 | private
   2 |       44 | private
   3 |       44 | private

Those are the four approaches to firing (or emulating) triggers when using replication. Which one you choose depends on what exactly your trigger does, but overall, the best solution is probably the 'trigger ALWAYS', followed by 'Bucardo customcols'. If you have another solution, or some problem that is not covered by the above, please let me know in the comments.


published by noreply@blogger.com (Brian Gadoury) on 2014-12-05 21:43:00 in the "AngularJS" category

Seeing the proposed line-up for the 2014 hack.summit() virtual conference was the grown-up equivalent of seeing the line-up for some of the first Lollapalooza events. It was definitely an "All those people I want to see, and all in one place? *head asplode*" moment.

So, what is this conference with the incredibly nerdy name? In short, it's a selection of industry leading speakers presenting all on-line and streamed live. The "registration fee" was actually a choice between mentioning the conference on a few social media platforms, or making a donation to one of a number of programming non-profits. Seeing as I don't tweet, I made a donation, then signed in (using OAuth) via my Google+ account. It was a delightfully frictionless process.

The hack.summit() conference ran December 1st through December 4th, but I was only able to "attend" the last two days. Luckily for me, all of the live-streamed presentations are also available afterwards on the hacksummit site. They feel a little hidden away in the small menu in the upper left corner, but they're all there, available as YouTube videos.

So, why was hack.summit() worth your time? It's got an amazing collection of very accomplished developers, thought leaders and experienced big cheeses of some companies that do some pretty impressive work. During the live event, the Crowdcast platform provided a great delivery mechanism for the streaming videos, as well as admin-created polls, a light-weight chat feature, and audience-voted questions for the presenters. Hack.summit() founder, Ed Roman, did a great job MC-ing the entire event, too. (And to whoever figured out how to game the voting system at a conference named hack.summit(), well played, you rogue.)

In closing, I strongly recommend you do a few things: Go sign up right now to gain access to the presentation videos. Commit some time (make a deal with yourself, get approval to do a group viewing at work, whatever) to watch as many presentations as you can. Lastly, set a calendar reminder to keep an eye out for the hack.summit() 2015 that will hopefully happen.


published by noreply@blogger.com (Kirk Harr) on 2014-12-03 08:00:00

Mezzanine (http://mezzanine.jupo.org/) is a powerful piece of software written for the Django Framework in Python that functions as a structured content management system similar to Drupal, WordPress, and others. Built on top of this CMS structure is a module known as Cartridge (http://cartridge.jupo.org/), which adds ecommerce features.

Installing Cartridge/Mezzanine

Installing the Mezzanine CMS from scratch can be accomplished in two ways: using the pip Python package manager, or cloning the current source from the maintainers' git repository. If you plan to modify either Mezzanine or Cartridge to customize the setup for your own needs, the latter is likely preferable, as you can then easily create custom branches off the original source to track your customization work. However, for this example I will show the pip method of installation:

pip install -U cartridge

Once installed, there is a mezzanine-project command which will allow you to create a new blank Mezzanine environment within a new directory; in this case we also pass an option instructing it to install the Cartridge module as well.

mezzanine-project -a cartridge new_cartridge_project

At this point you will have a blank Mezzanine environment with the Cartridge module installed. Now the Django database must be populated with the model information for the application, and then the Django application server can be started.

cd new_cartridge_project
python manage.py createdb
python manage.py runserver

If all went well, you should see the startup messages for the Django application server, listing the version numbers of the various libraries it will use, and it should then be up and listening on the loopback interface. To complete the setup, you would just need to point your httpd at this loopback socket to send connections there.

Cartridge Product Models

Products within Cartridge are defined by the Python models for the application. There are three primary models to be concerned with in a product definition: Product, ProductVariation, and ProductImage.

  • Product - Defines the primary attributes for a product like name, price, SKU, and can be populated with optional fields for sale prices, etc.
  • ProductVariation - Defines a variant of a product SKU, these would be most commonly used for things like product sizes and colors.
  • ProductImage - Defines the image for the picture of each product.
Each of these three models exists within Django as its own data set, with foreign key references from each Product to all of its variations and images. While you can alter these model definitions using the Django shell and other methods within Python, you can also update and manage the products within the admin interface for the Django site.

Adding Products / Tracking Orders

Adding products through the admin interface for Django was relatively easy and quick:


Within this interface you provide the product's name, published status, the date range for the product to be on the site, and the product description, in addition to any variants or images for the product. This is where the Mezzanine foundation of Cartridge starts to show through: the process mirrors creating a content page within Mezzanine, but adds the product attributes to the definition.

Product variants, discount codes, and sales can be created in much the same way within the admin interface. Once you have a basic Cartridge setup in place on your server, for most simple web store use cases Cartridge removes the need for any further hacking of Python code, and allows any user familiar with a CMS workflow for creating and managing objects to manage the store.

In addition, Cartridge provides order management within the admin interface as well:


From this interface you can update orders, create new orders, print the PDF invoices for the orders and track whether orders have been fulfilled and shipped to the customer.

Conclusions

Like other ecommerce applications built on top of a content management system, Cartridge and Mezzanine work well together to create a stack that offers both an easy start and a massive degree of possible customization. Each component of Cartridge is defined within Django, and can be rewritten and customized to serve any number of use cases. The setup process for Cartridge was pretty straightforward, but one recommendation I can make is that once you have a working system up, you should place all of the files for Mezzanine and Cartridge into a source code management system like Git. Like other web applications of this type, each change to the system is a possible point of failure, and once you start making changes to the Mezzanine configuration files and the Python for the models and views of Cartridge, controlling those changes becomes critical. In this way Mezzanine and Cartridge span the usual software continuum between ease of use and customization, providing a good example setup to start from while also exposing the full range of customization to the developer.


published by noreply@blogger.com (Greg Davidson) on 2014-12-02 18:43:00 in the "devtools" category

Microsoft recently announced a new service which I'm finding very useful. RemoteIE lets you test your sites with IE (currently version 11) on Windows 10 Technical Preview. The service runs in Azure RemoteApp which is available for several clients including Android, iOS and Windows Phone. What's great about this is that you do not have to install and maintain your own virtual machine with VirtualBox or VMWare.

RemoteIE

To use RemoteIE you'll need a valid Microsoft account — it's easy to sign up if you don't already have one. Once you have an account and have downloaded & installed the Azure RemoteApp client of your choice, it's just a matter of starting it up and logging in. Happy Testing!


published by Eugenia on 2014-12-01 06:20:36 in the "General" category
Eugenia Loli-Queru

I recently started exercising again (I’ll be doing endurance, weights, and sprints). My lungs are pushed to their limits, so I need to be on top of my game everyday. To accomplish this, I became much more strict with my diet. Not everybody has to be as strict though. This below is the diet I’d personally do to maximize casual sports performance. Disclaimer: I’m not a nutritionist.

Processed foods and ‘weird’ ingredients: None. Not even gluten-free processed foods.
Grains: None, except for fermented/sprouted brown rice (I personally do none at all). Have such rice only up to one or two meals per week.
Pseudograins: None, except for soaked quinoa, up to once a week (I personally do none at all, irritates my gut).
Dairy: Fermented-only, e.g. European style full-fat yogurt, cheese, sour cream, home-made full-fat goat kefir (I personally do none anymore).
Eggs: Pastured-only (or at the very least free-range organic). 7-10 eggs a week.
Nuts & Seeds: All are ok except peanuts, but they have to be raw, and then soaked (each type has different soaking times, look it up online). After draining them, these nuts & seeds keep up to 3 days in the fridge. Eat these only up to a small handful a day.
Flours: Almond, coconut, and a bit of tapioca flour are permitted only on a few special occasions in the year. Avoid all these “paleo desserts/breads”. These don’t really exist in this diet.
Legumes: All are ok except soy, but they have to be soaked in water for 24 hours before cooking them in high heat. Check lentils carefully, as they tend to have barley and small stones among them. Gluten-free tamari soy sauce is ok to use. Have legumes up to twice a week.
Fats: Coconut oil, avocado oil, olive oil, grass-fed butter, animal fats. Don’t restrict these healthy fats. Instead, avoid all trans-fats and most vegetable seed oils (particularly margarine).
Veggies: All, including white potatoes and other tubers/roots. 2/3s of what you eat daily in weight, should be veggies, and some of it should be raw. Include many kinds of new veggies in your diet, even the ones you never had before (e.g. sorrel, raw young garlic etc). Also eat fermented foods, e.g. unpasteurized kimchi, sauerkraut, water kefir.
Sea vegetables: Yes, from parts of the ocean that are clean. A tablespoon of various types of seaweed, daily (prepare it yourself to ensure they’re gluten-free, or get a GF one from Costco).
Fruits: All in season. Juices are not permitted. Smoothies are permitted, but the whole fruit goes in it, not just the sugary juice.
Fish: All, particularly the fatty & smaller ones (low in mercury). They all must be wild. Best options are wild Alaskan salmon and wild sardines with bones (canned ok).
Shellfish: All. They don’t have to be wild. Canned are ok too. If they’re in vegetable seed oil, strain them, and add lemon. The most nutritionally-dense shellfish is oysters.
Meat: All, but only wild or pastured/grass-fed (3-4 times a week). Seafood in this diet is the primary source of protein.
Offal: All, from pastured animals, once or twice a week. Liver twice a month. Heart has lots of CoQ10 and PQQ, which is great for mitochondria support.
Sugars: None, not even artificial. Some local, raw, unfiltered honey is permitted in RAW desserts (or added in gelatin-based but otherwise raw desserts). Organic brown sugar is permitted only if you’re using it for your water kefir (the kefir bacteria will eat it up during fermentation, so sugar is not an issue in this case). Don’t use honey in water kefir (honey kills the kefir bacteria).
Salt: Yes, Celtic sea salt. Avoid Himalayan, because while it’s otherwise very healthy, it’s also very high in fluoride (a little known fact).
Spices: Yes, particularly turmeric and ginger. Avoid ready-made mixes of many different spices, because these tend to have wheat fillers.
Bone Broth: At least 1 cup, daily (reheated and drank as-is, or used in soups). Slow-cook it for 24 hours with filtered water, pastured bones, 1/4 of a lemon OR 1 tablespoon of apple cider vinegar, and celery/carrots/onion only at the last two hours of the cooking process. Keeps up to a week in the fridge in jars, or up to a month in plastic bags in the freezer. Bone broth will keep your joints and knees healthy during exercise.
Coffee: Avoid all caffeine. It’ll take you two weeks to become free from it (drink less and less every day). Expect withdrawal symptoms, like headaches.
Teas: Herbal-only. Best are chamomile, and particularly Greek Mountain Tea (found in Mediterranean stores only).
Alcohol: If you have to party, rarely have some cider, or some wine. Particularly avoid beer at all costs because of its gluten (I personally do none at all).
Vitamins: The modern food and water don’t provide as much nutrition per weight as they did in the paleolithic times. Take D3 with breakfast, and get its level up to 80 ng/ml (test yearly to make sure you don’t overdose). Magnesium 3-4 times a week, 20 mins before bed. CoQ10 Ubiquinol 3 times a week, and K2-Mk4 2-3 times a week (lunch time). B1 only once or twice a month, with dinner. B12-sublingual or other vitamins only if a blood test reveals that you’re low. Avoid multi-vitamins.
Drugs: Avoid NSAIDs, steroids and other drugs that aren’t absolutely essential for your health condition (always with your doctor’s approval).
Water: Lots of it, and with no fluoride added (so avoid most tap water).

Regarding GMO/Organic: for some veggies/fruits it makes sense to get them as non-GMO and organic, for others it doesn’t. It depends on the kind of veggie/fruit we’re talking about. There are lists about this topic online.

Typical Diet:
Breakfast, after exercise: 1 egg, kimchi, seaweed salad, reheated baked sweet potato with butter on it, a reheated cup of bone broth, fruits. The sweet potato will refuel you.
Lunch: Seafood (I usually open a shellfish or sardines can for lunch) with green veggies or salad, cheese, and fruits or smoothie.
Snack: Chamomile, or almond milk, or home-made milk kefir, or a handful of soaked nuts/seeds.
Dinner: Seafood or meat/offal with veggies and 1 cup of starches. Also, salad, and fruits or yogurt with berries.

Regarding the mentioned starch at dinner time: have it only on the days you exercise. For example:
Mon: Beans (soaked)
Tue: White Potato
Wed: Rice (sprouted)
Thu: White Potato or parsnips and other starchy roots
Fri: Lentils (soaked)
Sat: Quinoa (soaked)

On Sundays, presumably the only day you don’t exercise (change it around if your schedule differs), go lower in starch:
Breakfast: Just a smoothie.
Lunch: Semi-starchy roots (e.g. carrots, beets, rutabaga, turnips), along with the rest of the lunch.
Snack: Bone broth, or herbal tea.
Dinner: Green veggies (low carb), along with the rest of the dinner.

This is a lower carb diet compared to the western diet, but it’s not very low carb, and it’s definitely not ketogenic. Don’t count calories/carbs. Just eat as much as you need, and have enough of the right fats (don’t skip them).

Fuel with more starch than usual for two days before an official race/game.

General info:
Get recipes using approved ingredients, here (minus the non-raw desserts & breads shown there, but plus the soaked legumes/quinoa/rice).

For pasta lovers, this is your pasta from now on. I personally prefer these over real pasta (especially the one made out of zucchini).

For pizza lovers, there are options, but none taste authentic I’m afraid. I’d suggest some rice-based pizza dough, on very special occasions only (Trader Joe’s has one that’s a bit acceptable).

How to do this cheaply in the US: veggies/fruits from Asian/Mexican stores, and the Farmer’s Market. Costco for seaweed salad, kimchi, organic ground beef. Trader Joe’s for raw nuts, canned sardines, canned Alaskan salmon, canned oysters. Dollar Store for other canned shellfish. Asian stores for shellfish, rabbit, duck gizzards, wild white fish. Pastured meat/offal/eggs is going to be expensive no matter what (although most goat/lamb is pastured in the US, find it in Mediterranean shops at acceptable prices).

Finally…

You’ll do worse before you do much better. It’ll take 3-4 weeks to adjust to this diet. During that time you will also be detoxing from sugar, gluten etc, so it’s to be expected that you will under-perform. But when that part is done, you should get much better than before, and quickly.

Good luck!


published by noreply@blogger.com (David Christensen) on 2014-11-27 02:33:00 in the "bash" category
I recently was involved in a project to migrate a client's existing application from MySQL to PostgreSQL, and I wanted to record some of my experiences in doing so in the hopes they would be useful for others.

Note that these issues should not be considered exhaustive, but were taken from my notes of issues encountered and/or things that we had to take into consideration in this migration process.

Convert the schema

The first step is to create the equivalent schema in your PostgreSQL system, generated from the original MySQL schema.

We used `mysqldump --compatible=postgresql --no-data` to get a dump which matched PostgreSQL's quoting rules. This file still required some manual editing to cleanup some of the issues, such as removing MySQL's "Engine" specification after a CREATE TABLE statement, but this resulted in a script in which we were able to create a skeleton PostgreSQL database with the correct database objects, names, types, etc.

Some of the considerations here include the database collations/charset. MySQL supports multiple collations/charset per database; in this case we ended up storing everything in UTF-8, which matched the encoding of the PostgreSQL database, so there were no additional changes needed here; otherwise, it would have been necessary to note the original encoding of the individual tables and later convert that to UTF-8 in the next step.

We needed to make the following modifications for datatypes:

MySQL datatype        PostgreSQL datatype
--------------        -------------------
tinyint               int
int(NN)               int
blob                  bytea*
datetime              timestamp with time zone
int unsigned          int**
enum('1')             bool
longtext              text
varbinary(NN)         bytea

* Note: we ended up converting these specific fields to text, just given the data that was stored in these fields in actuality, which just goes to show you should review your data.

** Note: because PostgreSQL does not have unsigned numeric types, if this feature is an important part of your data model you can/should add a CHECK constraint to the column in question to check that the value is non-negative.
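
For example, a column that was unsigned in MySQL could be protected like this (the table and column names here are only illustrative):

ALTER TABLE line_items
    ADD CONSTRAINT line_items_qty_nonnegative CHECK (qty >= 0);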

A few other syntactic changes were needed; for example, MySQL's UNIQUE KEY in a CREATE TABLE statement needs to be just UNIQUE.

Some of the MySQL indexes were defined as FULLTEXT indexes as well, which was a keyword PostgreSQL did not recognize. We made note of these, then created just normal indexes for the time being, intending to review to what extent these actually needed full text search capabilities.
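
For the tables that do turn out to need real full text search, one common Postgres approach (sketched here with made-up names, not something implemented as part of this migration) is an expression index over a tsvector:

-- index the searchable text
CREATE INDEX articles_body_fts_idx ON articles
    USING gin (to_tsvector('english', body));

-- and query it like so
SELECT * FROM articles
 WHERE to_tsvector('english', body) @@ to_tsquery('english', 'replication');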

Some of the AUTO_INCREMENT fields did not get the DEFAULT value set correctly to a sequence, because those types just ended up as integers without being declared a serial field, so we used the following query to correct this:

-- cleanup missing autoincrement fields

WITH
datasource AS (
    SELECT
        k.table_name,
        k.column_name,
        atttypid::regtype,
        adsrc
    FROM
        information_schema.key_column_usage k
    JOIN
        pg_attribute
    ON
        attrelid = k.table_name :: regclass AND
        attname = k.column_name
    LEFT JOIN
        pg_attrdef
    ON
        adrelid = k.table_name :: regclass AND
        adnum = k.ordinal_position
    WHERE
        table_name IN (
            SELECT table_name::text FROM information_schema.key_column_usage WHERE constraint_name LIKE '%_pkey' GROUP BY table_name HAVING count(table_name) = 1
        ) AND
        adsrc IS NULL AND
        atttypid = 'integer' ::regtype
),
frags AS (
    SELECT
        quote_ident(table_name || '_' || column_name || '_seq') AS q_seqname,
        quote_ident(table_name) as q_table,
        quote_ident(column_name) as q_col
    FROM
        datasource
),
queries AS (
    SELECT
        'CREATE SEQUENCE ' || q_seqname || ';
' ||
        'ALTER TABLE ' || q_table || ' ALTER COLUMN ' || q_col || $$ SET DEFAULT nextval('$$ || q_seqname || $$');
    $$ ||
        $$SELECT setval('$$ || q_seqname || $$',(SELECT max($$ || q_col || ') FROM ' || q_table || '));
' AS query
    FROM frags
)
SELECT
    COALESCE(string_agg(query, E'\n'),$$SELECT 'No autoincrement fixes needed';$$) AS queries FROM queries
\gset

BEGIN;
:queries
COMMIT;

Basically the idea is that we look for all tables with a defined integer primary key (hand-waving it by using the _pkey suffix in the constraint name), but without a current default value, then generate the equivalent SQL to create a sequence and set that table's default value to the nextval() for the sequence in question. We also generate SQL to scan that table and set that sequence value to the next appropriate value for the column in question. (Since this is for a migration and we know we'll be the only user accessing these tables, we can ignore MVCC.)

Another interesting thing about this script is that we utilize psql's ability to store results in a variable, using the \gset command, then we subsequently execute this SQL by interpolating that corresponding variable in the same script.

Convert the data


The next step was to prepare the data load from a MySQL data-only dump. Using a similar dump recipe as for the initial import, we used: `mysqldump --compatible=postgresql --no-create-info --extended-insert > data.sql` to save the data in a dump file so we could iteratively tweak our approach to cleaning up the MySQL data.

Using our dump file, we attempted a fresh load into the new PostgreSQL database. This failed initially due to multiple issues, including ones of invalid character encoding and stricter datatype interpretations in PostgreSQL.

What we ended up doing was to create a filter script to handle all of the "fixup" issues needed here. This involved decoding the data and reencoding to ensure we were using proper UTF8, performing some context-sensitive datatype conversions, etc.

Additional schema modifications


As we were already using a filter script to process the data dump, we decided to take the opportunity to fix up some warts in the current table definitions. This included some fields which were varchar, but should actually have been numeric or integer; as this was a non-trivial schema (100 tables), we were able to use PostgreSQL's system views to identify a list of columns which should be numeric and currently were not.

Since this was an ecommerce application, we identified columns that were likely candidates for data type reassignment based on field names *count, *qty, *price, *num.

Once we identified the fields in question, I wrote a script to generate the appropriate ALTER TABLE statements to first drop the default, change the column type, then set the new default. This was done via a mapping between table/column name and desired output type.
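
The generated statements followed this general pattern (a sketch with made-up table and column names; the real script emitted one such block per mapped column):

ALTER TABLE products ALTER COLUMN weight DROP DEFAULT;
ALTER TABLE products ALTER COLUMN weight TYPE numeric USING weight::numeric;
ALTER TABLE products ALTER COLUMN weight SET DEFAULT 0;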

Convert the application


The final (and needless to say most involved) step was to convert the actual application itself to work with PostgreSQL. Despite the fact that both databases speak SQL, we had to come up with solutions for the following issues:

Quotation styles


MySQL is more lax with its quoting styles, so some of this migration involved hunting down differences in quoting styles. The codebase contained lots of double-quoted string literals, which PostgreSQL interprets as identifiers, as well as the difference in quoting of column names (backticks for MySQL, double-quotes for PostgreSQL). These had to be identified wherever they appeared and fixed to use a consistent quoting style.

Specific unsupported syntax differences:


INSERT ON DUPLICATE KEY

MySQL supports the INSERT ON DUPLICATE KEY syntax. Modifying these queries involved creating a special UPSERT-style function to support the different options in use in the code base. We isolated and categorized the uses of INSERT ON DUPLICATE KEY UPDATE into several categories: those which did a straight record replace and those which did some sort of modification. I wrote a utility script (detailed later in this article) which served to replicate the logic needed to handle this as the application would expect.

Upcoming versions of PostgreSQL are likely to incorporate an INSERT ... ON CONFLICT UPDATE/IGNORE syntax, which would produce a more direct method of handling migration of these sorts of queries.

INSERT IGNORE

MySQL's INSERT ... IGNORE syntax allows you to insert a row and effectively ignore a primary key violation, assuming that the rest of the row is valid. You can handle this case via creating a similar UPSERT function as in the previous point. Again, this case will be easily resolved if PostgreSQL adopts the INSERT ... ON CONFLICT IGNORE syntax.

REPLACE INTO

MySQL's REPLACE INTO syntax effectively does a DELETE followed by an INSERT; it basically ensures that a specific version of a row exists for the given primary key value. We handle this case by just modifying these queries to do an unconditional DELETE for the Primary Key in question followed by the corresponding INSERT. We ensure these are done within a single transaction so the result is atomic.
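
For example (with a made-up table), a MySQL statement like this:

REPLACE INTO sessions (session_id, payload) VALUES ('abc123', 'cart data');

becomes this in PostgreSQL:

BEGIN;
DELETE FROM sessions WHERE session_id = 'abc123';
INSERT INTO sessions (session_id, payload) VALUES ('abc123', 'cart data');
COMMIT;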

INTERVAL syntax

Date interval syntax can be slightly different in MySQL; intervals may be unquoted in MySQL, but must be quoted in PostgreSQL. This project necessitated hunting down several literal INTERVAL expressions and adding the required quoting.
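
For example (the table is made up; the point is the quoting of the interval):

-- MySQL accepts an unquoted interval
SELECT * FROM orders WHERE created_at > now() - INTERVAL 7 DAY;

-- PostgreSQL requires a quoted interval literal
SELECT * FROM orders WHERE created_at > now() - INTERVAL '7 days';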

Function considerations


last_insert_id()

Many times when you insert a record into a MySQL table, later references to it are found using the last_insert_id() SQL function. These sorts of queries need to be modified to utilize the equivalent functionality using PostgreSQL sequences, such as the currval() function.
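
A sketch of the conversion (made-up table; the sequence name follows the default serial naming convention):

-- MySQL
INSERT INTO customers (name) VALUES ('Alice');
SELECT last_insert_id();

-- PostgreSQL
INSERT INTO customers (name) VALUES ('Alice');
SELECT currval('customers_id_seq');

-- or skip the second query entirely
INSERT INTO customers (name) VALUES ('Alice') RETURNING id;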

GROUP_CONCAT()

MySQL has the GROUP_CONCAT function, which serves as a string "join" of sorts. We emulate this behavior in PostgreSQL by using the string_agg aggregate function with the delimiter of choice.
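
For example (made-up table):

-- MySQL
SELECT order_id, GROUP_CONCAT(sku SEPARATOR ', ') AS skus FROM order_items GROUP BY order_id;

-- PostgreSQL
SELECT order_id, string_agg(sku, ', ') AS skus FROM order_items GROUP BY order_id;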

CONCAT_WS()

PostgreSQL has included a CONCAT_WS() function since PostgreSQL 9.1, so this was not an issue with the specific migration, but could still be an issue if you are migrating to an older version of PostgreSQL.

str_to_date()

This function does not exist directly in PostgreSQL, but can be simulated using to_date(). Note however that the format string argument differs between MySQL and PostgreSQL's versions.
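
For example:

-- MySQL
SELECT str_to_date('22/12/2014', '%d/%m/%Y');

-- PostgreSQL
SELECT to_date('22/12/2014', 'DD/MM/YYYY');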

date_format()

MySQL has a date_format() function which transforms a date type to a string with a given format option. PostgreSQL has similar functionality using the to_char() function; the main difference here lies in the format string specifier.
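
For example (made-up table and column):

-- MySQL
SELECT date_format(created_at, '%Y-%m-%d %H:%i') FROM orders;

-- PostgreSQL
SELECT to_char(created_at, 'YYYY-MM-DD HH24:MI') FROM orders;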

DateDiff()

DateDiff() does not exist in PostgreSQL; this is handled by transforming the function call into the equivalent date manipulation using the subtraction (-) operator.
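
For example:

-- MySQL: number of days between two dates
SELECT datediff('2014-12-25', '2014-11-27');

-- PostgreSQL: subtracting two dates yields an integer number of days
SELECT date '2014-12-25' - date '2014-11-27';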

rand() to random()

This is more or less a simple function rename: both MySQL and PostgreSQL have a function that returns a random float in the range 0.0 <= x < 1.0; only the name differs. (Note that the optional argument to MySQL's rand(N) is a seed, not a scale; if you need a random number between 0 and N in PostgreSQL, scale the result yourself via random() * N.)

IF() to CASE WHEN ELSE

MySQL has an IF() function which returns the second argument in the case the first argument evaluates to true, otherwise it returns the third argument. This can be trivially converted from IF(expression1, arg2, arg3) to the equivalent PostgreSQL syntax: CASE WHEN expression1 THEN arg2 ELSE arg3 END.

IFNULL() to COALESCE()

MySQL has a function IFNULL() which returns the first argument if it is not NULL, otherwise it returns the second argument. This can effectively be replaced by the PostgreSQL COALESCE() function, which serves the same purpose.

split_part()

MySQL's closest built-in equivalent to PostgreSQL's split_part() is SUBSTRING_INDEX(), which allows you to access a specific piece of a string delimited by another string. Unlike split_part(), it allows the index to be negative, in which case the piece is counted from the right-hand side.

in MySQL:
SUBSTRING_INDEX('a banana boat', ' ', -1) => 'boat'
in PostgreSQL:
split_part('a banana boat', ' ', -1) => // ERROR:  field position must be greater than zero

I fixed this issue by creating a custom plpgsql function to handle this case. (In my specific case, all of the negative indexes were -1; i.e., the last element in the array, so I created a function to return only the substring occurring after the last instance of the delimiter.)
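
A minimal sketch of such a function, covering only that "last element" case (the name split_part_last is made up here, and the actual implementation used in the migration may have differed):

CREATE OR REPLACE FUNCTION split_part_last(text, text)
RETURNS text
LANGUAGE sql
IMMUTABLE
AS $$
    -- split the string on the delimiter and return the final element
    SELECT (string_to_array($1, $2))[array_upper(string_to_array($1, $2), 1)];
$$;

-- split_part_last('a banana boat', ' ') => 'boat'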

Performance considerations


You may need to revisit COUNT(*) queries

MySQL MyISAM tables have a very fast COUNT(*) calculation, owing to queries taking a table lock (which means MySQL can cache the COUNT(*) result itself, since there can be no concurrent updates). PostgreSQL, on the other hand, utilizes MVCC, so calculating a table count necessitates a full table scan to determine which rows are visible to the calling snapshot. Any assumptions about cheap COUNT(*) queries may therefore need to be revisited in order to produce an equivalently performant query.
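
Where an exact number is not actually required, one common workaround (mentioned here as a sketch, not something done verbatim in this migration) is to read the planner's estimate instead of scanning the table:

-- approximate row count, based on the table statistics
SELECT reltuples::bigint AS approximate_count
  FROM pg_class
 WHERE relname = 'orders';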

GROUP BY vs DISTINCT ON

MySQL is much more (*ahem*) flexible when it comes to GROUP BY/aggregate queries, allowing some columns to be excluded in a GROUP BY or an aggregate function. Making the equivalent query in PostgreSQL involves transforming the query from SELECT ... GROUP BY (cols) to a SELECT DISTINCT ON (cols) ... and providing an explicit sort order for the rows.
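
A sketch of the transformation, with a made-up table and the goal of "one row per customer, the most recent order":

-- MySQL-ism: the non-grouped columns are chosen arbitrarily from each group
SELECT customer_id, order_id, created_at FROM orders GROUP BY customer_id;

-- PostgreSQL: state explicitly which row you want for each group
SELECT DISTINCT ON (customer_id) customer_id, order_id, created_at
  FROM orders
 ORDER BY customer_id, created_at DESC;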

More notes

Don't be afraid to script things; in fact, I would go so far as to suggest that everything you do should be scripted. This process was complicated, and there were lots of moving parts that had to move in tandem. There were changes being made to the site itself concurrently, so we were doing testing against a dump of the original database at a specific point in time. Having everything scripted ensured that the process was repeatable and testable, and that we could get to a specific point in the process without having to remember anything I'd done off-the-cuff.

In addition to scripting the actual SQL/migrations, I found it helpful to script the solutions to various classifications of problems. I wrote some scripts which I used to create some of the various scaffolding/boilerplate for the tables involved. This included a script which would create an UPSERT function for a specific table given the table name, which was used when replacing the INSERT ON DUPLICATE KEY UPDATE functions. This generated script could then be tailored to handle more complex logic beyond a simple UPDATE. (One instance here is an INSERT ON DUPLICATE KEY UPDATE which increased the count of a specific field in the table instead of replacing the value.)

#!/usr/bin/env perl
# -*-cperl-*-

use strict;
use warnings;

use DBI;
use Data::Dumper;

my $table = shift or die "Usage: $0 <table>\n";
my @cols = @ARGV;

my $dbh = DBI->connect(...);

my @raw_cols = @{ $dbh->column_info(undef, 'public', $table, '%')->fetchall_arrayref({}) };
my %raw_cols = map { $_->{COLUMN_NAME} => $_ } @raw_cols;

die "Can't find table $table\n" unless @raw_cols;

my @missing_cols = grep { ! defined $raw_cols{$_} } @cols;

die "Referenced non-existing columns: @missing_cols\n" if @missing_cols;

my %is_pk;

unless (@cols) {
    @cols = map { $_->{COLUMN_NAME} } @raw_cols;
}

my @pk_cols = $dbh->primary_key(undef, 'public', $table);

@is_pk{@pk_cols} = (1)x@pk_cols;

my @data_cols = grep { ! $is_pk{$_} } @cols;

die "Missing PK cols from column list!n" unless @pk_cols == grep { $is_pk{$_} } @cols;
die "No data columns!n" unless @data_cols;

print <<EOF
CREATE OR REPLACE FUNCTION
    upsert_$table (@{[
    join ', ' => map {
        "_$_ $raw_cols{ $_ }->{pg_type}"
    } @cols
]})
RETURNS void
LANGUAGE plpgsql
AS \$EOSQL\$
BEGIN
    LOOP
        UPDATE $table SET @{[
    join ', ' => map { "$_ = _$_" } @data_cols
]} WHERE @{[
    join ' AND ' => map { "$_ = _$_" } @pk_cols
]};
        IF FOUND THEN
            RETURN;
        END IF;
        BEGIN
            INSERT INTO $table (@{[join ',' => @cols]}) VALUES (@{[join ',' => map { "_$_" } @cols]});
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing, and loop to try the UPDATE again.
        END;
    END LOOP;
END
\$EOSQL\$
;
EOF
;

This script creates an upsert function for a given table, updating all columns by default, while also allowing you to create one that upserts only a subset of the columns.

I also wrote scripts which could handle/validate some of the column datatype changes. Since there were large numbers of columns which were changed, often multiple in the same table, I was able to have this script create a single ALTER TABLE statement with multiple ALTER COLUMN TYPE USING clauses, plus be able to specify the actual method by which these column changes were to take place. These included several different approaches, depending on the target data type, but generally were to solve cases where there was fairly legitimate data that was not picked up by PostgreSQL's input parsers. These included how to interpret blank fields as integers (in some cases we wanted 0, in others we wanted NULL), odd numeric formatting (leaving off digits before or after the decimal point), etc.
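
A sketch of the kind of statement the script produced (the table, columns, and exact cleanup rules are made up for illustration):

ALTER TABLE orders
    ALTER COLUMN item_count TYPE integer
        USING NULLIF(trim(item_count), '')::integer,           -- blank becomes NULL
    ALTER COLUMN shipping_cost TYPE numeric
        USING (CASE WHEN trim(shipping_cost) = '' THEN '0'
                    ELSE trim(shipping_cost) END)::numeric;     -- blank becomes 0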

We had to fix up missing defaults for AUTO_INCREMENT columns in several locations. The tables were created with the proper datatype; however, we had to find the tables which matched a specific naming convention, create and associate a sequence for the serial column, set the proper default, etc. (This was detailed above.)

There was a fair amount of iteration and customization in this process, as there was a fair amount of data which was not of the expected format. The process was iterative, and generally involved attempting to alter the table from within a transaction and finding the next datum which the conversion to the expected type did not work. This would result in a modification of the USING clause of the ALTER TABLE ALTER COLUMN TYPE to accommodate some of the specific issues.

In several cases, there were only a couple records which had bad/bunko data, so I included explicit UPDATE statements to update those data values via primary key. While this felt a bit "impure", it was a quick and preferred solution to the issue of a few specific records which did not fit general rules.


published by noreply@blogger.com (Greg Sabino Mullane) on 2014-11-24 16:42:00 in the "postgres" category

One of the more frightening things you can run across as a DBA (whether using Postgres or a lesser system) is a crash followed by a complete failure of the database to start back up. Here's a quick rundown of the steps one could take when this occurs.

The first step is to look at why it is not starting up by examining the logs. Check your normal Postgres logs, but also check the filename passed to the --log argument for pg_ctl, as Postgres may not have even gotten far enough to start normal logging. Most of the time these errors are not serious, are fairly self-explanatory, and can be cured easily - such as running out of disk space. When in doubt, search the web or ask in the #postgresql IRC channel and you will most likely find a solution.

Sometimes the error is more serious, or the solution is not so obvious. Consider this problem someone had in the #postgresql channel a while back:

LOG: database system was interrupted while in recovery at 2014-11-03 12:43:09 PST
HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 1883/AF9458E8
LOG: unexpected pageaddr 1882/BAA7C000 in log file 6275, segment 189, offset 10993664
LOG: redo done at 1883/BDA7A9A8
LOG: last completed transaction was at log time 2014-10-25 17:42:53.836929-07
FATAL: right sibling's left-link doesn't match: block 6443 links to 998399 instead of expected 6332 in index "39302035"

As you can see, Postgres has already hinted you may be in deep trouble with its suggestion to use a backup. The Postgres daemon completely fails to start because an index is corrupted. Postgres has recognized that the B-tree index no longer looks like a B-tree should and bails out.

For many errors, the next step is to attempt to start Postgres in single-user mode. This is similar to "Safe mode" in Windows - it starts Postgres in a simplified, bare-bones fashion, and is intended primarily for debugging issues such as a failed startup. This mode is entered by running the 'postgres' executable directly (as opposed to having pg_ctl do it), and passing specific arguments. Here is an example:

$ /usr/bin/postgres --single -D /var/lib/pgsql93/data -P -d 1

This starts up the 'postgres' program (used to be 'postmaster'), enters single-user mode, specifies where the data directory is located, turns off system indexes, and sets the debug output to 1. After it is run, you will have a simple prompt. From here you can fix your problem, such as reindexing bad indexes, that may have caused a normal startup to fail. Use CTRL-d to exit this mode:

$ /usr/bin/postgres --single -D /var/lib/pgsql93/data -P -d 1
NOTICE:  database system was shut down at 2014-11-20 16:51:26 UTC
DEBUG:  checkpoint record is at 0/182B5F8
DEBUG:  redo record is at 0/182B5F8; shutdown TRUE
DEBUG:  next transaction ID: 0/1889; next OID: 12950
DEBUG:  next MultiXactId: 1; next MultiXactOffset: 0
DEBUG:  oldest unfrozen transaction ID: 1879, in database 1
DEBUG:  oldest MultiXactId: 1, in database 1
DEBUG:  transaction ID wrap limit is 2147485526, limited by database with OID 1
DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1

PostgreSQL stand-alone backend 9.3.5
backend> [CTRL-d]
NOTICE:  shutting down
NOTICE:  database system is shut down
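
For example, had the corrupt index from the error at the top of this article been the only problem, the fix from this prompt might have been as simple as (index name taken from that error message; yours will differ):

backend> REINDEX INDEX "39302035";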

If you are not able to fix things with single-user mode, it's time to get serious. This would be an excellent time to make a complete file-level backup. Copy the entire data directory to a different server or at least a different partition. Make sure you get everything in the pg_xlog directory as well, as it may be symlinked elsewhere.

Time to use pg_resetxlog, right? No, not at all. Use of the pg_resetxlog utility should be done as an absolute last resort, and there are still some things you should try first. Your problem may have already been solved - so the next step should be to upgrade Postgres to the latest revision. With Postgres, a revision (the last number in the version string) is always reserved for bug fixes only. Further, changing the revision is almost always as simple as installing a new binary. So if you are running Postgres version 9.0.3, upgrade to the latest in the 9.0 series (9.0.18 as of this writing). Check the release notes, make the upgrade, and try to start up Postgres.

Still stumped? Consider asking for help. For fast, free help, try the #postgresql IRC channel. For slightly slower free help, try the pgsql-general mailing list. For both of these options, the majority of the subscribers are clustered near the US Eastern time zone, so response times will be faster at 3PM New York time versus 3AM New York time. For paid help, you can find a Postgres expert (such as End Point!) in the list of professional services at postgresql.org.

The next steps depend on the error, but another route is to hack the source code for Postgres to work around the error preventing the startup. This could mean, for example, changing a FATAL exception to a WARNING, or other trickery. This is expert-level stuff, to be sure, but done carefully it can still be safer than pg_resetxlog. If possible, try this on a copy of the data!

If you have done everything else, it is time to attempt using pg_resetxlog. Please make sure you read the manual page about it before use. Remember this is a non-reversible, possibly data-destroying command! However, sometimes it is the only thing that will work.

If you did manage to fix the problem - at least enough to get Postgres to start - the very next item is to make a complete logical backup of your database. This means doing a full pg_dump right away. This is especially important if you used pg_resetxlog. Dump everything, then restore it into a fresh Postgres cluster (upgrading to the latest revision first if needed!). The pg_dump will not only allow you to create a clean working version of your database, but is a great way to check on the integrity of your data, as it by necessity examines every row of data you have. It will *not* check on the sanity of your indexes, but there are other ways to do that, the simplest being to do a REINDEX DATABASE on each database in your cluster.

All of these steps, including pg_resetxlog, may or may not help. In the "left-link doesn't match" example at the top, nothing was able to fix the problem (not single-user mode, nor a more recent revision, nor pg_resetxlog). It's possible that the data could have been recovered by hacking the source code or using tools to extract the data directly, but that was not necessary as this was a short-lived AWS experiment. The consensus was it was probably a hardware problem. Which goes to show that you can never totally trust your hardware or software, so always keep tested, frequent, and multiple backups nearby!


published by noreply@blogger.com (Kulbir Singh) on 2014-11-24 07:56:00 in the "django" category

Twilio


Twilio is a powerful HTTP API that allows you to build voice and SMS apps. The goal of this blog post is to make building SMS applications as simple as possible in Django.

There is already a Twilio Python helper library available. The open source twilio-python library lets us write Python code to make HTTP requests to the Twilio API.

Installation


The easiest way to install the twilio-python library is using pip, a package manager for Python.

Simply run the following command in a terminal:

$ pip install twilio

Twilio API Credentials

To integrate the Twilio API into a Django application, we need the TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN variables, which can be found by logging into your Twilio account dashboard. They are used to authenticate with the Twilio API. The utility function and view below also expect a TWILIO_PHONE_NUMBER setting holding the Twilio number that messages are sent from.

You'll need to add them to your settings.py file:

TWILIO_ACCOUNT_SID = 'ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
TWILIO_AUTH_TOKEN = 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY'
TWILIO_PHONE_NUMBER = '+1XXXXXXXXXX'  # your Twilio phone number (the From number)

Create a New App


We are going to interact with people using SMS, so I prefer to create an app named communication. I am assuming, you've already installed Django.

Run the following command in a terminal.

$ django-admin.py startapp communication

We will need to register the new app in our Django project.
Add it to the INSTALLED_APPS tuple in your settings.py file:

INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'communication',
    ...
)

Create the Model


Now we'll open up communication/models.py to start creating models for our app.

from django.db import models


class SendSMS(models.Model):
    to_number = models.CharField(max_length=30)
    from_number = models.CharField(max_length=30)
    body = models.CharField(max_length=160)  # the message text; the form and view below expect this field
    sms_sid = models.CharField(max_length=34, default="", blank=True)
    account_sid = models.CharField(max_length=34, default="", blank=True)
    created_at = models.DateTimeField(auto_now_add=True)
    sent_at = models.DateTimeField(null=True, blank=True)
    delivered_at = models.DateTimeField(null=True, blank=True)
    status = models.CharField(max_length=20, default="", blank=True)
    # price fields set by the view below when Twilio reports a price; the field sizes are my own guess
    price = models.DecimalField(max_digits=8, decimal_places=4, null=True, blank=True)
    price_unit = models.CharField(max_length=8, default="", blank=True)


and run the syncdb command after defining the model:

$ python manage.py syncdb

It will create the necessary database tables for our app.

Create utils.py file


Create a new file at communication/utils.py.

Put the following code in communication/utils.py:

from django.conf import settings

import twilio
import twilio.rest


def send_twilio_message(to_number, body):
    client = twilio.rest.TwilioRestClient(
        settings.TWILIO_ACCOUNT_SID, settings.TWILIO_AUTH_TOKEN)

    return client.messages.create(
        body=body,
        to=to_number,
        from_=settings.TWILIO_PHONE_NUMBER
    )


Testing send_twilio_message


Open the Django shell (python manage.py shell) and run the following commands.

>>> from communication.utils import send_twilio_message
>>> sms = send_twilio_message('+15005550006', 'Hello Endpointer,')
>>> print sms.sid
SM97f8ac9321114af1b7fd4463ff8bd038

Getting back a sid means that everything on the backend is working fine, and we can proceed to work on the front end.
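Note that if the request to Twilio fails (bad credentials, an invalid destination number, and so on), the client raises an exception rather than returning a message object. Here is a minimal sketch of how you might guard against that; the safe_send helper is my own addition, and the exception class name is the one twilio-python exposed at the time of writing, so check it against your installed version:

from communication.utils import send_twilio_message

from twilio import TwilioRestException  # assumption: adjust this import to match your twilio-python version


def safe_send(to_number, body):
    # Send an SMS, returning the message object, or None if Twilio rejects the request.
    try:
        return send_twilio_message(to_number, body)
    except TwilioRestException as exc:
        # Log or otherwise record the failure instead of letting it bubble up to the user.
        print 'Twilio error: %s' % exc
        return None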

Create Form

Let's create a form to gather the data. Open (or create) communication/forms.py and paste the following code into it:

from django import forms
from .models import SendSMS


class SendSMSForm(forms.ModelForm):

    class Meta:
        model = SendSMS
        fields = ('to_number', 'body')


Create the View

Open up communication/views.py and add the following CreateView:
from decimal import Decimal

from django.conf import settings
from django.core.urlresolvers import reverse_lazy
from django.utils.encoding import force_text
from django.utils.timezone import now
from django.views.generic import CreateView

from .forms import SendSMSForm
from .models import SendSMS
from .utils import send_twilio_message


class SendSmsCreateView(CreateView):
    model = SendSMS
    form_class = SendSMSForm
    template_name = 'communication/sendsms_form.html'
    success_url = reverse_lazy('send_sms')

    def form_valid(self, form):
        number = form.cleaned_data['to_number']
        body = form.cleaned_data['body']
        # call twilio
        sent = send_twilio_message(number, body)

        # save form
        send_sms = form.save(commit=False)
        send_sms.from_number = settings.TWILIO_PHONE_NUMBER
        send_sms.sms_sid = sent.sid
        send_sms.account_sid = sent.account_sid
        send_sms.status = sent.status
        send_sms.sent_at = now()
        if sent.price:
            send_sms.price = Decimal(force_text(sent.price))
            send_sms.price_unit = sent.price_unit
        send_sms.save()

        return super(SendSmsCreateView, self).form_valid(form)



Defining URLS

The URL configuration tells Django how to match a request's path to your Python code. Django looks for the URL configuration, defined as urlpatterns, in the urls.py file in your project:

from django.conf.urls import patterns, url

from .views import SendSmsCreateView

urlpatterns = patterns('',
    url(
        regex=r'^communication/send/sms/$',
        view=SendSmsCreateView.as_view(),
        name='send_sms'
    ),
)

Creating the Template

Now that we've defined a URL for our view, we can try it out. Django includes a development server you can use to easily test your project:

$ python manage.py runserver

If you visit http://127.0.0.1:8000/communication/send/sms/ in your browser, though, you'll see an error: TemplateDoesNotExist.

This is because we have not defined the template file yet. So now create a sendsms_form.html file in templates/communication/ and put the following code in it:

<form method="post">{% csrf_token %}
  {% for field in form %}
  {{ field }} {{ field.errors }}
  {% endfor %}
  <input type="submit" value="Send SMS" />
</form>

Now reload http://127.0.0.1:8000/communication/send/sms/ in your browser. Assuming everything is okay, you should then see the following form:


Fill out the form, and hit the submit button to send your SMS.

CONCLUSION

Congratulations, your SMS was successfully sent. Good luck!

published by noreply@blogger.com (Selvakumar Arumugam) on 2014-11-19 16:02:00 in the "big-data" category
The 11th edition of Open Source India, 2014 was held at Bengaluru, India. The two-day conference was filled with three parallel tracks of tech talks and workshops spread across various open source technologies.


In-depth look at Architecting and Building solutions using MongoDB

Aveekshith Bushan & Ranga Sarvabhouman from MongoDB started off the session with a comparison of the hardware costs of storage systems in earlier days and today. When storage hardware was very expensive, the approach was to filter the data to reduce its size before storing it in the database; results could only be generated from that filtered data, and there was no option to go back and process the original source data. Now that storage has become cheap, we can store the raw data first, do all of our filtering and processing afterwards, and then distribute the results.

Earlier,
        Filter -> Store -> Distribute
Present,
        Store -> Filter -> Distribute

Since we are now storing huge amounts of data, we need a processing system that can handle and analyse the data efficiently. Data today is growing rapidly, and that growth is commonly described by the three Vs of Big Data: we need to handle a huge Volume of data, in a wide Variety of formats, arriving at high Velocity. MongoDB does several things to satisfy these requirements.

MongoDB simply stores the data as documents without any data type constraints, which helps it store huge amounts of data quickly. It leaves constraint checking to the application level to increase storage speed on the database end, though it does recognize data types once the data is stored as a document. In simple words, the philosophy is: why do we need to check the same things (data types or other constraints) in two places, the application and the database?

MongoDB stores all related data in a single document and fetches it in a single disk seek. By avoiding multiple disk seeks, it achieves very fast retrieval of data, whereas in a relational database the relations are stored in different tables, which leads to multiple disk seeks to retrieve the complete data for an entity. MongoDB doesn't support joins, but it has a reference option to refer to another collection (table) without imposing foreign key constraints.
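To make the embedding-versus-reference point concrete, here is a small sketch of my own (not from the talk) using the pymongo driver; it assumes a local mongod on the default port and a pymongo 3.x client with insert_one/find_one:

from pymongo import MongoClient

client = MongoClient()  # assumes mongod is running locally on the default port
db = client.demo

# Related data embedded in one document: a single seek retrieves the order and its line items
order_id = db.orders.insert_one({
    'customer': 'Alice',
    'items': [{'sku': 'A1', 'qty': 2}, {'sku': 'B7', 'qty': 1}],
}).inserted_id

# A reference instead of a join: store the _id and resolve it in the application
db.invoices.insert_one({'order_id': order_id, 'total': 42.50})
invoice = db.invoices.find_one()
order = db.orders.find_one({'_id': invoice['order_id']})
print(order['customer'])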

As per the db-engines rankings, MongoDB sits at the top of the NoSQL database world. It also provides certain key features, which I remember from the session as follows:
    • Sub-documents duplicate data, but this helps performance (since storage is cheap, the duplication doesn't hurt much)
    • Auto-sharding (scalability)
    • Sharding allows parallel access to the system
    • Range-based sharding
    • Replica sets (high availability)
    • Secondary indexes
    • Indexes are the main tunable part of the MongoDB system
    • Partitioning across systems
    • Rolling upgrades
    • Schema free
    • Rich document-based queries
    • Reads from secondaries
When do you need MongoDB?
    • When data grows beyond the capacity of a relational database system
    • When you need fast performance for online requests
Finally, the speakers emphasized understanding your use case clearly and choosing the right MongoDB features to get effective performance.

OpenStack Mini Conf


A special half-day OpenStack mini conference was organised during the second half of the first day. The talks ranged from the basics to in-depth looks at the OpenStack project. I have summarised all the talks here to give an idea of the OpenStack software platform.

OpenStack is an open source cloud computing platform for provisioning Infrastructure as a Service (IaaS). There is a wonderful project called DevStack for setting up OpenStack in a development environment in the easiest and fastest way, and the well-written documentation of the OpenStack project clearly explains everything. In addition, anyone can contribute to OpenStack with the help of the How to Contribute guide; the project uses the Gerrit review system and the Launchpad bug tracking system.

OpenStack has multiple components that provide the various features of Infrastructure as a Service. Here is a list of the OpenStack components and the purpose of each one:

Nova (Compute) - manages the pool of compute resources
Cinder (Block Storage) - provides storage volumes to machines
Neutron (Network) - manages networks and IP addresses
Swift (Object Storage) - provides a distributed, highly available (replicated) object storage system
Glance (Image) - provides a repository to store disk and server images
KeyStone (Identity) - provides a common authentication system across all components
Horizon (Dashboard) - provides a GUI for users to interact with the OpenStack components
Ceilometer (Telemetry) - provides service usage and billing reports
Ironic (Bare Metal) - provisions bare metal machines instead of virtual machines
Sahara (Map Reduce) - provisions Hadoop clusters for big data processing

OpenStack services are usually mapped to AWS services to better understand the purpose of the components. The following table depicts the mapping of similar services in OpenStack and AWS:

OpenStack      AWS
Nova           EC2
Cinder         EBS
Neutron        VPC
Swift          S3
Glance         AMI
KeyStone       IAM
Horizon        AWS Console
Ceilometer     Cloudwatch
Sahara         Elastic Mapreduce

Along with the overview of the OpenStack architecture, there were a couple of in-depth talks, which are listed below with slides.
That was a wonderful Day One of OSI 2014, which helped me get a better understanding of MongoDB and OpenStack.

published by noreply@blogger.com (Marco Manchego) on 2014-11-18 01:16:00 in the "company" category

End Point Corporation is pleased to announce the official launch of its new website in Portuguese! The site, http://liquidgalaxy.pt.endpoint.com/, officially signals the arrival of End Point's Liquid Galaxy in Brazil and aims to provide service to all current and future customers in one of the largest and most dynamic markets in South America.

With a population of more than 200 million, Brazil is also a quick adopter of new technologies, with a considerable number of industry leaders that can benefit directly from deploying a Liquid Galaxy. This includes a solid commodities sector, a growing real estate market, tourism, and a vibrant media market, all strong candidates for the new technology.

Brazil is also the entry point to the South American market in general. We are confident that as we increase our penetration of the Brazilian market, other opportunities in the region will follow. Dave Jenkins, our VP of Sales and Marketing, offers the following: "We are excited to see this expansion into Brazil. I always see great things coming out of São Paulo and Rio, and I always take part in the tech conferences there, which are always packed."

If you would like to know more about this technology, please contact us: vendas@endpoint.com

published by noreply@blogger.com (Marco Manchego) on 2014-11-18 01:16:00 in the "company" category

End Point Corporation is pleased to announce the official launch of its new Brazilian Portuguese Liquid Galaxy website! The site, found at http://liquidgalaxy.pt.endpoint.com/, officially signals the arrival of End Point's Liquid Galaxy to Brazil, and aims to provide service to all current and future customers in what is South America's largest and most dynamic market.

With a population over 200 million, Brazil is also a quick adopter of new technologies with sizeable industry sectors that can benefit directly from the implementation of a Liquid Galaxy. This includes a massive commodities sector, booming real estate, tourism and a vibrant media market, all of which are strong candidates for the technology.

Brazil is also a logical entry-point into the larger South American market in general. We're confident that as we increase market penetration in Brazil, other opportunities in the region will soon follow. Dave Jenkins, our VP of Sales and Marketing, offers the following: "We're excited to see this expansion into Brazil. I always see great things coming out of São Paulo and Rio whenever I go there for tech conferences, which are always booked to overflowing levels."

If you have international business in South America, or are based in Brazil and would like to know more about this great technology, please contact us at vendas@endpoint.com


published by noreply@blogger.com (Bianca Rodrigues) on 2014-11-14 17:39:00 in the "e-commerce" category

Introduction

I recently started working with Spree and wanted to learn how to implement some basic features. I focused on one of the most common needs of any e-commerce business - adding a sale functionality to products. To get a basic understanding of what was involved, I headed straight to the Spree Developer Guides. As I was going through the directions, I realized it was intended for the older Spree version 2.1. This led to me running into a few issues as I went through it using Spree's latest version 2.3.4. I wanted to share with you what I learned, and some tips to avoid the same mistakes I made.

Set-up

I'll assume you have the prerequisites it lists including Rails, Bundler, ImageMagick and the Spree gem. These are the versions I'm running on my Mac OS X:
  • Ruby: 2.1.2p95
  • Rails: 4.1.4
  • Bundler: 1.5.3
  • ImageMagick: 6.8.9-1
  • Spree: 2.3.4

What is Bundler? Bundler provides a consistent environment for Ruby projects by tracking and installing the exact gems and versions that are needed. You can read more about the benefits of using Bundler on their website. If you're new to Ruby on Rails and/or Spree, you'll quickly realize how useful Bundler is when updating your gems.

After you've successfully installed the necessary tools for your project, it's time to create our first Rails app, which will then be used as a foundation for our simple Spree project called mystore.

Let's create our app

Run the following commands:

$ rails new mystore
$ cd mystore
$ gem install spree_cmd

*Note: you may get a warning that you need to run bundle install before trying to start your application since spree_gateway.git isn't checked out yet. Go ahead and follow those directions, I'll wait.

Spree-ify our app

We can add the e-commerce platform to our Rails app by running the following command:

spree install --auto-accept

If all goes well, you should get a message that says, "Spree has been installed successfully. You're all ready to go! Enjoy!". Now the fun part - let's go ahead and start our server to see what our demo app actually looks like. Run rails s to start the server and open up a new browser page pointing to the URL localhost:3000.
*Note - when you navigate to localhost:3000, watch your terminal - you'll see a lot of processes running in the background as the page loads simultaneously in your browser window. It can be pretty overwhelming, but as long as you get a "Completed 200 OK" message in your terminal, you should be good to go! See it below:


Our demo app actually comes with an admin interface ready to use. Head to your browser window and navigate to http://localhost:3000/admin. The login Spree instructs you to use is spree@example.com and password spree123.

Once you login to the admin screen, this is what you should see:


Once you begin to use Spree, you'll soon find that the most heavily used areas of the admin panel include Orders, Products, Configuration and Promotions. We'll be going into some of these soon.

Extensions in 3.5 steps

The next part of the Spree documentation suggests adding the spree_fancy extension to our store to update the look and feel of the website, so let's go ahead and follow the next few steps:

Step 1: Update the Gemfile

We can find our Gemfile by going back to the terminal and, within the mystore directory, typing ls to see a list of all the files and subdirectories within the Spree app. You will see the Gemfile there - open it using your favorite text editor. Add the following line at the end of your Gemfile, and save it:
gem 'spree_fancy', :git => 'git://github.com/spree/spree_fancy.git', :branch => '2-1-stable'

Notice the branch it mentions is 2-1-stable. Since you just installed Spree, you are most likely using the latest version, 2-3-stable. I changed my branch in the above gem to '2-3-stable' to reflect the Spree version I'm currently using. After completing this step, run bundle install to install the gem using Bundler.

Now we need to copy over the migrations and assets from the spree_fancy extension by running this command in your terminal within your mystore application:

$ bundle exec rails g spree_fancy:install

Step 1.5: We've hit an error!

At this point, you've probably hit a LoadError, and we can no longer see our beautiful Spree demo app, instead getting an error page which says "Sprockets::Rails::Helper::AssetFilteredError in Spree::Home#index" at the top. How do we fix this?

Within your mystore application directory, navigate to the config/initializers/assets.rb file and edit the last line of code by uncommenting it and typing:

Rails.application.config.assets.precompile += %w ( bx_loader.gif )

Now restart your server and you will see your new theme!

Step 2: Create a sales extension

Now let's see how to create an extension instead of using an existing one. According to the Spree tutorial, we first need to generate an extension - remember to run this command from a directory outside of your Spree application:

$ spree extension simple_sales
Once you do that, cd into your spree_simple_sales directory. Next, run bundle install to update your Spree extension.

Now you can create a migration that adds a sale_price column to variants using the following command:

bundle exec rails g migration add_sale_price_to_spree_variants sale_price:decimal

Once your migration is complete, open db/migrate/XXXXXXXXXXXX_add_sale_price_to_spree_variants.rb and add in the changes as shown in the Spree tutorial:

class AddSalePriceToSpreeVariants < ActiveRecord::Migration
  def change
    add_column :spree_variants, :sale_price, :decimal, :precision => 8, :scale => 2
  end
end
Now let's switch back to our mystore application so that we can add our extension before continuing any development. Within mystore, add the following to your Gemfile:
gem 'spree_simple_sales', :path => '../spree_simple_sales'
You will have to adjust the path ('../spree_simple_sales') depending on where you created your sales extension.

Now it's time to bundle install again, so go ahead and run that. Now we need to copy our migration by running this command in our terminal:

$ rails g spree_simple_sales:install

Step 3: Adding a controller Action to HomeController

Once the migration has been copied, we need to extend the functionality of Spree::HomeController and add an action that selects "on sale" products. Before doing that, we need to make sure to change our .gemspec file within the spree_simple_sales directory (remember: this is outside of our application directory). Open up the spree_simple_sales.gemspec file in your text editor and add the following line to the list of dependencies:

s.add_dependency 'spree_frontend'

Run bundle.

Run $ mkdir -p app/controllers/spree to create the directory structure for our controller decorator. This is where we will create a new file called home_controller_decorator.rb and add the following content to it:

module Spree
  HomeController.class_eval do
    def sale
      @products = Product.joins(:variants_including_master).where('spree_variants.sale_price is not null').uniq
    end
  end
end

As Spree explains it, this script will select just the products that have a variant with a sale_price set.

Next step - add a route to this sales action in our config/routes.rb file. Make sure your routes.rb file looks like this:

Spree::Core::Engine.routes.draw do
  get "/sale" => "home#sale"
end

Let's set a sale price for the variant

Normally, to change a variant attribute, we could do it through the admin interface, but we haven't created this functionality yet. This means we need to open up our rails console:
*Note - you should be in the mystore directory

Run $ rails console

The next steps are taken directly from the Spree documentation:

"Now, follow the steps I take in selecting a product and updating its master variant to have a sale price. Note, you may not be editing the exact same product as I am, but this is not important. We just need one 'on sale' product to display on the sales page."

> product = Spree::Product.first
=> #<Spree::Product id: 107377505, name: "Spree Bag", description: "Lorem ipsum dolor sit amet, consectetuer adipiscing...", available_on: "2013-02-13 18:30:16", deleted_at: nil, permalink: "spree-bag", meta_description: nil, meta_keywords: nil, tax_category_id: 25484906, shipping_category_id: nil, count_on_hand: 10, created_at: "2013-02-13 18:30:16", updated_at: "2013-02-13 18:30:16", on_demand: false>

> variant = product.master
=> #<Spree::Variant id: 833839126, sku: "SPR-00012", weight: nil, height: nil, width: nil, depth: nil, deleted_at: nil, is_master: true, product_id: 107377505, count_on_hand: 10, cost_price: #<BigDecimal:7f8dda5eebf0,'0.21E2',9(36)>, position: nil, lock_version: 0, on_demand: false, cost_currency: nil, sale_price: nil>

> variant.sale_price = 8.00
=> 8.0

> variant.save
=> true

Hit Ctrl-D to exit the console.

Now we need to create the page that renders the product that is on sale. Let's create a view to display these "on sale" products.

Create the required views directory by running: $ mkdir -p app/views/spree/home

Create a file in your new directory called sale.html.erb and add the following to it:

<%= render 'spree/shared/products', :products => @products %>

Now start your rails server again and navigate to localhost:3000/sale to see the product you listed on sale earlier! Exciting stuff, isn't it? The next step is to actually reflect the sale price instead of the original price by fixing our sales price extension using Spree Decorator.

Decorate your variant

Create the required directory for your new decorator within your mystore application: $ mkdir -p app/models/spree

Within your new directory, create a file called variant_decorator.rb and add:

module Spree
  Variant.class_eval do
    alias_method :orig_price_in, :price_in
    def price_in(currency)
      return orig_price_in(currency) unless sale_price.present?
      Spree::Price.new(:variant_id => self.id, :amount => self.sale_price, :currency => currency)
    end
  end
end

The original price_in method is aliased to orig_price_in, and price_in now returns the original price unless a sale_price is present, in which case a price based on the sale price is returned for the product's variant.

In order to ensure that our modification to the core Spree functionality works, we need to write a couple of unit tests for variant_decorator.rb. We need a full Rails application present to test it against, so we can create a barebones test_app to run our tests against.

Run the following command from the root directory of your EXTENSION: $ bundle exec rake test_app

It will begin the process by saying "Generating dummy Rails application..." - great, you're on the right path.

Once you finish creating your dummy Rails app, run the rspec command and you should see the following output:

No examples found.

Finished in 0.00005 seconds
0 examples, 0 failures

Now it's time to start adding some tests by replicating your extension's directory structure in the spec directory: $ mkdir -p spec/models/spree

In your new directory, create a file called variant_decorator_spec.rb and add this test:

require 'spec_helper'

describe Spree::Variant do
  describe "#price_in" do
    it "returns the sale price if it is present" do
      variant = create(:variant, :sale_price => 8.00)
      expected = Spree::Price.new(:variant_id => variant.id, :currency => "USD", :amount => variant.sale_price)

      result = variant.price_in("USD")

      result.variant_id.should == expected.variant_id
      result.amount.to_f.should == expected.amount.to_f
      result.currency.should == expected.currency
    end

    it "returns the normal price if it is not on sale" do
      variant = create(:variant, :price => 15.00)
      expected = Spree::Price.new(:variant_id => variant.id, :currency => "USD", :amount => variant.price)

      result = variant.price_in("USD")

      result.variant_id.should == expected.variant_id
      result.amount.to_f.should == expected.amount.to_f
      result.currency.should == expected.currency
    end
  end
end

Deface overrides

Next we need to add a field to our product admin page, so we don't have to always go through the rails console to update a product's sale_price. If we directly override the view that Spree provides, then whenever Spree updates that view in a new release we lose the update, and we'd have to add our customizations back into the new version of the view to stay up to date.

A better way to override views is to use Deface, a Rails library that lets us customize a view without directly editing the underlying view file. All view customizations live in ONE location, app/overrides, which makes sure your app is always using the latest implementation of the view provided by Spree.

  1. Go to mystore/app/views/spree and create an admin/products directory and create the file _form.html.erb.
  2. Copy the full file NOT from Spree's GitHub but from your Spree backend. You can think of your Spree backend as the area to edit your admin (among other things) - the spree_backend gem contains the most updated _form.html.erb - if you use the one listed in the documentation, you will get some Method Errors on your product page.

In order to find the _form.html.erb file in your spree_backend gem, navigate to your app, and within that, run the command: bundle show spree_backend

The result is the location of your spree_backend. Now cd into that location, and navigate to app/views/spree/admin/products - this is where you will find the correct _form.html.erb. Copy the contents of this file into the newly created _form.html.erb file within your application's directory structure you just created: mystore/app/views/spree/admin/products.

Now we want to add a field container for sale price after the price field container, so we need to create another override. Create a new file in your application's app/overrides directory called add_sale_price_to_product_edit.rb and add the following content:

Deface::Override.new(:virtual_path => 'spree/admin/products/_form',
  :name => 'add_sale_price_to_product_edit',
  :insert_after => "erb[loud]:contains('text_field :price')",
  :text => "
    <%= f.field_container :sale_price do %>
      <%= f.label :sale_price, raw(Spree.t(:sale_price) + content_tag(:span, ' *')) %>
      <%= f.text_field :sale_price, :value =>
        number_to_currency(@product.sale_price, :unit => '') %>
      <%= f.error_message_on :sale_price %>
    <% end %>
  ")

The last step is to update our model in order to get an updated product edit form. Create a new file in your application's app/models/spree directory called product_decorator.rb. Add the following content:

module Spree
  Product.class_eval do
    delegate_belongs_to :master, :sale_price
  end
end

Now you can check to see if it worked by heading to http://localhost:3000/admin/products and editing one of the products. Once you're on the product edit page, you should see a new field container called SALE PRICE. Add a sale price in the empty field and click update. Once completed, navigate to http://localhost:3000/sale to find an updated list of products on sale.

CONCLUSION

Congratulations, you've created the sales functionality! If you're using Spree 2.3 to create a sales functionality for your application, I would love to know what your experience was like. Good luck!


published by noreply@blogger.com (Josh Williams) on 2014-11-13 15:25:00 in the "ssl" category

The encryption times, they are a-changin'.

Every once in a while I'll take a look at the state of SNI, in the hopes that we're finally ready for putting it to wide-scale use.
It started a few years back when IPv6 got a lot of attention, though in reality very few end user ISPs had IPv6 connectivity at that time.  (And very few still do!  But that's another angry blog.)  So, essentially, IPv4 was still the only option, and thus SNI was still important.

Then earlier this year Microsoft dropped [public] support for Windows XP.  Normally this is one of those things that would be pretty far off my radar, but Internet Explorer on XP is one of the few clients* that doesn't support SNI.  So at that time, with hope in my heart, I ran a search through the logs on a few of our more active servers, only to find that roughly 5% of the hits were MSIE on Windows XP.  So much for that.

(* Android < 3.0 has the same problem, incidentally. But it in contrast constituted 0.2% of the hits.  So I'm not as worried about the lack of support in that case.)

Now in fairly quick succession a couple other things have happened: SSLv3 is out, and SSL certificates with SHA-1 signatures are out.  This has me excited.  I'll tell you why in a moment.

First, now that I've written "SNI" four times, I should probably tell you that it stands for Server Name Indication, and basically means the client sends the intended server name very early in the connection process.  That gives the server the opportunity to select the correct certificate for the given name, and present it to the client.
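In Python terms, the server_hostname argument to the standard library's ssl module (Python 3.4 or later) is what places that name in the SNI extension of the handshake. This is a minimal sketch of my own, with example.com standing in for any HTTPS host:

import socket
import ssl

hostname = 'example.com'  # hypothetical host; substitute any HTTPS site

context = ssl.create_default_context()
with socket.create_connection((hostname, 443)) as sock:
    # server_hostname sends the SNI value, letting the server pick the matching certificate
    with context.wrap_socket(sock, server_hostname=hostname) as ssock:
        print(ssock.getpeercert()['subject'])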

If at this point you're yelling "of course!" at the screen, press Ctrl-F and search for "SSLv3" below.

For the rest, pull up a chair, it's time for a history lesson.

Back in the day, when a web browser wanted to connect to a site it performed a series of steps: it looked up the given name in DNS to get an IP address, connected to that IP address, and then requested the path in the form of "GET /index.html".  Quite elegant, fairly straightforward.  And notice the name only matters to the client, which uses it for the DNS look-up.  To the server it matters not at all.  The server accepts a connection on an IP address and responds to the request for a specific path.

A need arose for secure communication.  Secure Socket Layer (SSL) establishes an encrypted channel over which private information can be shared.  In order to fight man-in-the-middle attacks, a certificate exchange takes place. When the connection is made the server offers up the certificate and the client (among other things I'm glossing over) confirms the name on the certificate matches the name it thinks it tried to connect to.  Note that the situation is much the same as above, in that the client cares about the name, but the server just serves up what it has associated with the IP address.

Around the same time, the Host header appeared.  Finally the browser had a way to send the name of the site it's trying to access over to the server; it just makes it part of the request.  What a simple thing it is, and what a world that opens up on the server side.  Instead of associating a single site per IP address, a single web server listening on one address can examine the Host header and serve up a virtually unlimited number of completely distinct sites.
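For illustration, here is a sketch of my own showing plain HTTP over a raw socket; the Host header is just one extra line in the request, and it is all the server needs to pick the right site (example.com is a stand-in for any name-based virtual host):

import socket

sock = socket.create_connection(('example.com', 80))  # hypothetical host
# The Host header rides along inside the request itself
sock.sendall(b'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n')
print(sock.recv(4096).decode('latin-1'))
sock.close()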

However there's a problem.  At the time, both were great advances.  But, unfortunately, they were mutually exclusive.  SSL is established first, immediately upon connection, and after that the HTTP communication happens over the secure channel.  But the Host header is part of the HTTP request.  Spot the problem yet?  The server has to serve up a certificate before the client has an opportunity to tell the server what site it wants.  If the server has multiple sites and it serves up the wrong certificate, the name doesn't match what the client expects, and at best it displays a big, scary warning to the end user, or at worst refuses to continue with communication.

There are a few work-arounds, but this is already getting long and boring.  If you're really curious, search for "Subject Alternative Name" and try to imagine why it's an inordinately expensive solution when the list of sites a server needs to support changes.

So for a long time, that was the state of things.  And by a long time, I mean almost 20 years, from when these standards were released.  In computing, that's a long time.  Thus I'd hoped that by now, SNI would be an option as the real solution.

Fast forward to the last few months.  We've had the news that SSLv3 isn't to be trusted, and the news that SHA-1 signatures on SSL certificates are pretty un-cool.  SHA-256 is in, and major CAs are now using that signature by default.  Why does this have me excited?  In the case of the former, Windows XP pre-SP3 only supports up to SSLv3, so any site that's mitigated the POODLE vulnerability is already excluding these clients.  Similarly, in the case of the latter, pre-IE8 clients are excluded by sites that have implemented SHA-2 certificates.

Strictly speaking we're not 100% there, as a fully up-to-date Internet Explorer on Windows XP is still compatible with these recent SSL ecosystem changes.  But the sun is setting on this platform, and maybe soon we'll be able to start putting this IPv4-saving technology into use.

Soon.


published by noreply@blogger.com (Joshua Tolley) on 2014-11-12 23:33:00 in the "postgres" category

From Flickr user Jitze Couperus

When debugging a problem, it's always frustrating to get sidetracked hunting down the relevant logs. PostgreSQL users can select any of several different ways to handle database logs, or even choose a combination. But especially for new users, or those getting used to an unfamiliar system, just finding the logs can be difficult. To ease that pain, here's a key to help dig up the correct logs.

Where are log entries sent?

First, connect to PostgreSQL with psql, pgadmin, or some other client that lets you run SQL queries, and run this:
foo=# show log_destination ;
 log_destination 
-----------------
 stderr
(1 row)
The log_destination setting tells PostgreSQL where log entries should go. In most cases it will be one of four values, though it can also be a comma-separated list of any of those four values. We'll discuss each in turn.

SYSLOG

Syslog is a complex beast, and if your logs are going here, you'll want more than this blog post to help you. Different systems have different syslog daemons, those daemons have different capabilities and require different configurations, and we simply can't cover them all here. Your syslog may be configured to send PostgreSQL logs anywhere on the system, or even to an external server. For your purposes, though, you'll need to know what "ident" and "facility" you're using. These values tag each syslog message coming from PostgreSQL, and allow the syslog daemon to sort out where the message should go. You can find them like this:
foo=# show syslog_facility ;
 syslog_facility 
-----------------
 local0
(1 row)

foo=# show syslog_ident ;
 syslog_ident 
--------------
 postgres
(1 row)
Syslog is often useful, in that it allows administrators to collect logs from many applications into one place, to relieve the database server of logging I/O overhead (which may or may not actually help anything), or any number of other interesting rearrangements of log data.

EVENTLOG

For PostgreSQL systems running on Windows, you can send log entries to the Windows event log. You'll want to tell Windows to expect the log values, and what "event source" they'll come from. You can find instructions for this operation in the PostgreSQL documentation discussing server setup.

STDERR

This is probably the most common log destination (it's the default, after all) and can get fairly complicated in itself. Selecting "stderr" instructs PostgreSQL to send log data to the "stderr" (short for "standard error") output pipe most operating systems give every new process by default. The difficulty is that PostgreSQL or the applications that launch it can then redirect this pipe to all kinds of different places. If you start PostgreSQL manually with no particular redirection in place, log entries will be written to your terminal:
[josh@eddie ~]$ pg_ctl -D $PGDATA start
server starting
[josh@eddie ~]$ LOG:  database system was shut down at 2014-11-05 12:48:40 MST
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
LOG:  statement: select syntax error;
ERROR:  column "syntax" does not exist at character 8
STATEMENT:  select syntax error;
In these logs you'll see the output from me starting the database, connecting to it from some other terminal, and issuing the obviously erroneous command "select syntax error". But there are several ways to redirect this elsewhere. The easiest is with pg_ctl's -l option, which essentially redirects stderr to a file, in which case the startup looks like this:
[josh@eddie ~]$ pg_ctl -l logfile -D $PGDATA start
server starting
Finally, you can also tell PostgreSQL to redirect its stderr output internally, with the logging_collector option (which older versions of PostgreSQL named "redirect_stderr"). This can be on or off, and when on, collects stderr output into a configured log directory.

So if you see a log_destination set to "stderr", a good next step is to check logging_collector:
foo=# show logging_collector ;
 logging_collector 
-------------------
 on
(1 row)
In this system, logging_collector is turned on, which means we have to find out where it's collecting logs. First, check log_directory. In my case, below, it's an absolute path, but by default it's the relative path "pg_log". This is relative to the PostgreSQL data directory. Log files are named according to a pattern in log_filename. Each of these settings is shown below:
foo=# show log_directory ;
      log_directory      
-------------------------
 /home/josh/devel/pg_log
(1 row)

foo=# show data_directory ;
       data_directory       
----------------------------
 /home/josh/devel/pgdb/data
(1 row)

foo=# show log_filename ;
          log_filename          
--------------------------------
 postgresql-%Y-%m-%d_%H%M%S.log
(1 row)
Documentation for each of these options, along with settings governing log rotation, is available here.

If logging_collector is turned off, you can still find the logs using the /proc filesystem, on operating systems equipped with one. First you'll need to find the process ID of a PostgreSQL process, which is simple enough:
foo=# select pg_backend_pid() ;
 pg_backend_pid 
----------------
          31950
(1 row)
Then, check /proc/YOUR_PID_HERE/fd/2, which is a symlink to the log destination:
[josh@eddie ~]$ ll /proc/31113/fd/2
lrwx------ 1 josh josh 64 Nov  5 12:52 /proc/31113/fd/2 -> /var/log/postgresql/postgresql-9.2-local.log
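That lookup is easy to script. Here is a minimal sketch of my own, using the psycopg2 driver (an assumption; any driver that can run the query works): ask the backend for its PID, then read the symlink for file descriptor 2:

import os

import psycopg2  # assumed driver; adjust the connection string for your setup

conn = psycopg2.connect('dbname=foo')
cur = conn.cursor()
cur.execute('SELECT pg_backend_pid()')
pid = cur.fetchone()[0]

# fd 2 is stderr; on systems with /proc, this symlink points at the actual log destination
print(os.readlink('/proc/%d/fd/2' % pid))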

CSVLOG

The "csvlog" mode creates logs in CSV format, designed to be easily machine-readable. In fact, this section of the PostgreSQL documentation even provides a handy table definition if you want to slurp the logs into your database. CSV logs are produced in a fixed format the administrator cannot change, but it includes fields for everything available in the other log formats. For these to work, you need to have logging_collector turned on; without logging_collector, the logs simply won't show up anywhere. But when configured correctly, PostgreSQL will create CSV format logs in the log_directory, with file names mostly following the log_filename pattern. Here's my example database, with log_destination set to "stderr, csvlog" and logging_collector turned on, just after I start the database and issue one query:
[josh@eddie ~/devel/pg_log]$ ll
total 8
-rw------- 1 josh josh 611 Nov 12 16:30 postgresql-2014-11-12_162821.csv
-rw------- 1 josh josh 192 Nov 12 16:30 postgresql-2014-11-12_162821.log
The CSV log output looks like this:
[josh@eddie ~/devel/pg_log]$ cat postgresql-2014-11-12_162821.csv 
2014-11-12 16:28:21.700 MST,,,2993,,5463ed15.bb1,1,,2014-11-12 16:28:21 MST,,0,LOG,00000,"database system was shut down at 2014-11-12 16:28:16 MST",,,,,,,,,""
2014-11-12 16:28:21.758 MST,,,2991,,5463ed15.baf,1,,2014-11-12 16:28:21 MST,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,""
2014-11-12 16:28:21.759 MST,,,2997,,5463ed15.bb5,1,,2014-11-12 16:28:21 MST,,0,LOG,00000,"autovacuum launcher started",,,,,,,,,""
2014-11-12 16:30:46.591 MST,"josh","josh",3065,"[local]",5463eda6.bf9,1,"idle",2014-11-12 16:30:46 MST,2/10,0,LOG,00000,"statement: select 'hello, world!';",,,,,,,,,"psql"
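Because the format is fixed, these logs are easy to consume programmatically. As a small sketch of my own, matching the sample output above (column 0 is the timestamp, column 11 the severity, and column 13 the message):

import csv

with open('postgresql-2014-11-12_162821.csv') as logfile:
    for row in csv.reader(logfile):
        # print just the timestamp, severity, and message of each entry
        print(row[0], row[11], row[13])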

published by noreply@blogger.com (Greg Sabino Mullane) on 2014-11-10 22:07:00 in the "git" category

When using git, being able to track down a particular version of a file is an important debugging skill. The common use case for this is when someone is reporting a bug in your project, but they do not know the exact version they are using. While normal software versioning resolves this, bug reports often come in from people using the HEAD of a project, and thus the software version number does not help. Finding the exact set of files the user has is key to being able to duplicate the bug, understand it, and then fix it.

How you get to the correct set of files (which means finding the proper git commit) depends on what information you can tease out of the user. There are three classes of clues I have come across, each of which is solved a different way. You may be given clues about:

  1. Date: The date they downloaded the files (e.g. last time they ran a git pull)
  2. File: A specific file's size, checksum, or even contents.
  3. Error: An error message that helps guide to the right version (especially by giving a line number)

Finding a git commit by date

This is the easiest one to solve. If all you need is to see how the repository looked around a certain point in time, you can use git checkout with git rev-list to get it. I covered this in detail in an earlier post, but the best answer is below. For all of these examples, I am using the public Bucardo repository at git clone git://bucardo.org/bucardo.git

$ DATE='Sep 3 2014'
$ git checkout `git rev-list -1 --before="$DATE" master`
Note: checking out '79ad22cfb7d1ea950f4ffa2860f63bd4d0f31692'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 79ad22c... Need to update validate_sync with new columns

Or if you prefer xargs over backticks:

$ DATE='Sep 3 2014'
$ git rev-list -1 --before="$DATE" master | xargs -Iz git checkout z

What about the case in which there were multiple important commits on the given day? If the user doesn't know the exact time, you will have to make some educated guesses. You might add the -p flag to git log to examine what changes were made and how likely they are to interact with the bug in question. If it is still not clear, you may just want to have the user mail you a copy or a checksum of one of the key files, and use the method below.

Once you have found the commit you want, it's a good idea to tag it right away. This applies to any of the three classes of clues in this article. I usually add a lightweight git tag immediately after doing the checkout. Then you can easily come back to this commit simply by using the name of the tag. Give it something memorable and easy, such as the bug number being reported. For example:

$ git checkout `git rev-list -1 --before="$DATE" master`
## Give a lightweight tag to the current commit
$ git tag bug_23142
## We need to get back to our main work now
$ git checkout master
## Later on, we want to revisit that bug
$ git checkout bug_23142
## Of course, you may also want to simply create a branch

Finding a git commit by checksum, size, or exact file

Sometimes you can find the commit you need by looking for a specific version of an important file. One of the "main" files in the repository that changes often is your best bet for this. You can ask the user for the size, or just a checksum of the file, and then see which repository commits have a matching entry.

Finding a git commit when given a checksum

As an example, a user in the Bucardo project has encountered a problem when running HEAD, but all they know is that they checked it out sometime in the last four months. They also run "md5sum Bucardo.pm" and report that the MD5 of the file Bucardo.pm is 767571a828199b6720f6be7ac543036e. Here's the easiest way to find what version of the repository they are using:

$ SUM=767571a828199b6720f6be7ac543036e
$ git log --format=%H \
  | xargs -Iz sh -c \
    'echo -n "z "; git show z:Bucardo.pm | md5sum' \
  | grep -m1 $SUM \
  | cut -d " " -f 1 \
  | xargs -Iz git log z -1
xargs: sh: terminated by signal 13
commit b462c256e62e7438878d5dc62155f2504353be7f
Author: Greg Sabino Mullane 
Date:   Fri Feb 24 08:34:50 2012 -0500

    Fix typo regarding piddir

I'm using variables in these examples both to make copy and paste easier, and because it's always a good idea to save away constant but hard-to-remember bits of information. The first part of the pipeline grabs a list of all commit IDs: git log --format=%H.

We then use xargs to feed the list of commit IDs one by one to a shell. The shell grabs a copy of the Bucardo.pm file as it existed at the time of that commit, and generates an MD5 checksum of it. We echo the commit on the line as well, since we will need it later on. So for each commit we now generate the commit hash and the MD5 of the Bucardo.pm file.

Next, we pipe this list to grep so we only match the MD5 we are looking for. We use -m1 to stop processing once the first match is found (this is important, as the extraction and checksumming of files is fairly expensive, so we want to short-circuit it as soon as possible). Once we have a match, we use the cut utility to extract just the commit ID, and pipe that back into git log. Voila! Now we know the very last time the file existed with that MD5, and can checkout the given commit. (The "terminated by signal 13" is normal and expected)

You may wonder if a sha1sum would be better, as git uses those internally. Sadly, the process remains the same, as the algorithm git uses to generate its internal SHA1 checksums is sha1("blob " . length(file) . "\0" . contents(file)), and you can't expect a random user to compute that and send it to you! :)
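That computation is easy enough to do yourself when you have the file in hand, though. Here is a short sketch of my own that matches the output of git hash-object (which comes up again below when we are given the file itself):

import hashlib

def git_blob_sha1(path):
    # git hashes the string "blob <size>\0" followed by the raw file contents
    with open(path, 'rb') as f:
        data = f.read()
    header = 'blob %d\0' % len(data)
    return hashlib.sha1(header.encode() + data).hexdigest()

print(git_blob_sha1('bucardo.clue'))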

Finding a git commit when given a file size

Another piece of information the user can give you very easily is the size of a file. For example, they may tell you that their copy of Bucardo.pm weighs in at 167092 bytes. As this file changes often, it can be a unique-enough marker to help you determine when they checked out the repository. Finding the matching size is a matter of walking backwards through each commit and checking the file size of every Bucardo.pm as it existed:

$ SIZE=167092
$ git rev-list --all \
  | while read commit
 do if git ls-tree -l -r $commit \
  | grep -q -w $SIZE
 then echo $commit
 break
 fi
 done
d91807d59a6326e48077311e96e4d5730f24304c

The git ls-tree command generates a list of all blobs (files) for a given commit. The -l option tells it to also print the file size, and the -r option asks it to recurse. So we use git rev-list to generate a list of all the commits (by default, these are output from newest to oldest). Then we pass each commit to the ls-tree command, and use grep to see if that number appears anywhere in the output. If it does, grep returns truth, making the if statement fire the echo, which shows us the commit. The break ensures we stop after the first match. We now have the (probable) commit that the user checked the file out of. As we are not matching by filename, it's probably a good idea to double-check by running git ls-tree -l -r on the given commit.

Finding a git commit when given a copy of the file itself

This is very similar to the size method above, except that we are given the file itself, not the size, so we need to generate some metadata about it. You could run a checksum or a filesize and use one of the recipes above, or you could do it the git way and find the SHA1 checksum that git uses for this file (aka a blob) by using git hash-object. Once you find that, you can use git ls-tree as before, as the blob hash is listed next to the filename. Thus:

$ HASH=`git hash-object ./bucardo.clue`
$ echo $HASH
639b247aab027b79bda788182c8b6785ed319662
$ git rev-list --all \
  | while read commit
 do if git ls-tree -r $commit \
  | grep -F -q $HASH
 then echo $commit
 break
 fi
 done
cd1d776307204cb77a731aa1b15c3c43a275c70e

Finding a git commit by error message

Sometimes the only clue you are given is an error message, or some other snippet that you can trace back to one or more commits. For example, someone once mailed the list to ask about this error that they received:

DBI connect('dbname=bucardo;host=localhost;port=5432',
  'bucardo',...) failed: fe_sendauth: no password supplied at 
  /usr/local/bin/bucardo line 8627.

A quick glance at line 8627 of the file "bucardo" in HEAD showed only a closing brace, so it must be an earlier version of the file. What was needed was to walk backwards in time and check that line for every commit until we find one that could have triggered the error. Here is one way to do that:

$ git log --format=%h \
  | xargs -n 1 -I{} sh -c \
  "echo -n {}; git show {}:bucardo | head -8627 | tail -1" \
  | less
## About 35 lines down:
379c9006     $dbh = DBI->connect($BDSN, 'bucardo'...

Therefore, we can do a "git checkout 379c9006" and see if we can solve the user's problem.

These are some of the techniques I use to hunt down specific commits in a git repository. Are there other clues you have run up against? Better recipes for hunting down commits? Let me know in the comments below.