Saturday 24 November 2012

Django, uwsgi, nginx and virtualenv

Bring on the Pain - just kidding!


There are lots of tutorials online for getting the Python Django web framework and the nginx web server working together. However, some of them skip over specific details of what to do and automatically assume you're a Linux God, able to configure anything hi-tech using only your little finger. Because of this, I've decided to create my own tutorial for doing this with Python 2.7.x and a virtualenv.

One part of getting things to work together is the pain of dealing with uWSGI. uWSGI calls itself an "application container server coded in pure C", which deals with the protocols for communicating between the web server (nginx) and your Python framework (Django). What it really is, in fact, is a great WSGI interface written by talented hackers which is, however, rather user-unfriendly.

In addition, most documentation for Django for using uwsgi seems slightly out of date, and uwsgi's own documentation is skimpy and really only hints at what you can do, without practically helping much, especially for those of us whose skills don't include hardcore Linux administration. For example, it took me ages to realize that the options to put in a uwsgi ini file are the same as those displayed when you type "uwsgi --help". Also, I have only a basic knowledge of Unix sockets, and so I got quite stuck.

So I thought, I'll address this! Here is a quick, get-you-up-and-running guide for deploying with Django, uwsgi, nginx and a virtualenv. (This is not a guide to teach you the basics of any of those four things, only how to configure them so you get a web app running!) This guide was tested on Linux Mint 12 and Ubuntu Server 12.04. It is for a Django project "progress_recorder" that will be stored in my /var/www/py directory - you would replace the name and location with those of your own project.


Steps


1) Install the pip and virtualenv packages for Python
sudo apt-get install python-pip
pip install virtualenv


2) Make a virtualenv for a Python version (and activate)
cd /opt
virtualenv py273
source /opt/py273/bin/activate


3) Install Django into your virtualenv
pip install django
# Check version
python -c "import django; print(django.get_version())"
1.4.2


4) Install uwsgi into your virtualenv
pip install uwsgi
# Check version
/opt/py273/bin/uwsgi --version
1.4.1


5) Install nginx
sudo apt-get install nginx
# Check version
nginx -V
nginx: nginx version: nginx/1.1.19

# Check it's running
sudo service nginx start
ps aux | grep nginx

- you should see "nginx: master process" in the output somewhere.

6) Make a Django project
-Get the page views/templates all displaying with the test server (python manage.py runserver).
Here is the structure of my project, named "progress_recorder", in a tree outline:

progress_recorder
├── manage.py
└── progress_recorder
    ├── django.ini
    ├── django_wsgi.py
    ├── __init__.py
    ├── settings.py
    ├── templates
    │   ├── entry.html
    │   ├── index.html
    │   └── __init__.py
    ├── urls.py
    └── views
        ├── entry.py
        ├── index.py
        └── __init__.py


6.a) Set your templates in settings.py      
TEMPLATE_DIRS = (
    '/var/www/py/progress_recorder/progress_recorder/templates',)
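
Hard-coding the absolute path works, but it has to be edited if the project ever moves. Here is a minimal sketch of deriving the same path from the location of settings.py instead, assuming the templates directory sits next to settings.py as in the tree above. SETTINGS_FILE is a hypothetical stand-in for __file__, which is what you would use inside the real settings.py:

```python
import posixpath

# Hypothetical stand-in for __file__ inside settings.py.
SETTINGS_FILE = "/var/www/py/progress_recorder/progress_recorder/settings.py"

# Directory that contains settings.py.
PROJECT_DIR = posixpath.dirname(SETTINGS_FILE)

# Same value as the hard-coded tuple, built from the settings location.
TEMPLATE_DIRS = (posixpath.join(PROJECT_DIR, "templates"),)
print(TEMPLATE_DIRS[0])
```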
       
   
6.b) Set your url patterns in urls.py
urlpatterns = patterns('',
    url(r'^progress_recorder/entry$', 'views.entry.entry'),
    url(r'^progress_recorder$', 'views.index.index'),
)
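
Django matches these patterns against the request path with the leading slash removed. Here is a small sketch, using only the "re" module, of which paths the two patterns above accept. The matches() helper is made up for illustration and is not part of Django:

```python
import re

# The two regular expressions from urls.py above.
PATTERNS = [r'^progress_recorder/entry$', r'^progress_recorder$']


def matches(request_path):
    """Return the first pattern matching the path, or None."""
    path = request_path.lstrip('/')  # drop the leading slash first
    for pattern in PATTERNS:
        if re.search(pattern, path):
            return pattern
    return None


print(matches('/progress_recorder'))
print(matches('/progress_recorder/entry'))
print(matches('/progress_recorder/'))  # trailing slash: neither pattern matches
```

Note that, as written, the patterns do not accept a trailing slash.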


       
7) Get Django working with uwsgi

7.a) Make a django_wsgi.py file:
vim django_wsgi.py

#!/usr/bin/env python

import os

import django.core.handlers.wsgi

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "progress_recorder.settings")
application = django.core.handlers.wsgi.WSGIHandler()
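
For anyone wondering what uWSGI does with that "application" object: under the WSGI convention it is just a callable that takes the request environment and a start_response function. Here is a rough sketch with a hand-written callable (demo_application is a made-up stand-in, not Django's handler), driven by the standard library's wsgiref helpers instead of a real server:

```python
from wsgiref.util import setup_testing_defaults


def demo_application(environ, start_response):
    """A stand-in WSGI callable, playing the role of Django's WSGIHandler."""
    body = ("You asked for " + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Build a fake request environment, as a server such as uWSGI would.
environ = {"PATH_INFO": "/progress_recorder"}
setup_testing_defaults(environ)

# Record what the application passes to start_response.
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


response_body = b"".join(demo_application(environ, start_response))
print(captured["status"], response_body)
```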



7.b) Make a uwsgi ini file, named django.ini:
vim django.ini

[uwsgi]
# add the project directory to Python's module search path
pythonpath = /var/www/py/progress_recorder/progress_recorder
virtualenv = /opt/py273
# set the http port
http = :8000
# change to django project directory
chdir = /var/www/py/progress_recorder/progress_recorder
# load django
module = django_wsgi:application
env = DJANGO_SETTINGS_MODULE=settings


7.c) Launch uwsgi
/opt/py273/bin/uwsgi --ini django.ini
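
As mentioned earlier, the options in the ini file are the same ones listed by "uwsgi --help". Here is a small sketch that makes the mapping visible by reading some ini text with Python's configparser and printing the equivalent command-line flags. It is written for Python 3, and the ini_to_flags() helper is made up for illustration (uwsgi itself does not need it):

```python
import configparser

INI_TEXT = """
[uwsgi]
virtualenv = /opt/py273
http = :8000
chdir = /var/www/py/progress_recorder/progress_recorder
module = django_wsgi:application
env = DJANGO_SETTINGS_MODULE=settings
"""


def ini_to_flags(ini_text):
    """Turn the [uwsgi] section into a flat list of --option value pairs."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    flags = []
    for option, value in parser.items("uwsgi"):
        flags.extend(["--" + option, value])
    return flags


print(" ".join(ini_to_flags(INI_TEXT)))
```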

7.d) In your browser, test uwsgi is working with Django
-Go to:
http://localhost:8000/progress_recorder
-you should see a Django template view displayed via uWSGI itself acting as a web server.

7.e) Stop uwsgi
Ctrl + c
Check this has worked with:
ps aux | grep uwsgi
-if you see any uwsgi processes still running, for each process ID:
kill -9 [processid]

8) Get nginx working with uwsgi

8.a) Check the nginx user's user name
head /etc/nginx/nginx.conf
-you should see in there:
user www-data;

8.b) Make a project configuration file for nginx
(uwsgi support is built into nginx)

cd /etc/nginx/sites-enabled
vim progress_recorder.conf

server {
    listen        80;
    server_name     localhost;   
    location / {
        root    /var/www/py/progress_recorder/progress_recorder;
        uwsgi_pass unix:///tmp/progress_recorder.sock;
        uwsgi_modifier1 30;
        include uwsgi_params;
        #autoindex on;
    }
}


-Save the project config file.

8.c) Alter the Django project's uwsgi ini file to use a socket
-remove the http setting, and insert a socket setting in its place.

[uwsgi]
# add the project directory to Python's module search path
pythonpath = /var/www/py/progress_recorder/progress_recorder
virtualenv = /opt/py273
# New
socket = /tmp/progress_recorder.sock

# change to django project directory
chdir = /var/www/py/progress_recorder/progress_recorder
# load django
module = django_wsgi:application
env = DJANGO_SETTINGS_MODULE=settings



8.d) Start uwsgi as the nginx user, and check that the socket is picked up
sudo su - www-data
(Enter your password)
/bin/bash (to get a decent bash shell)
cd [project_directory], i.e.:
cd /var/www/py/progress_recorder/progress_recorder
/opt/py273/bin/uwsgi --ini django.ini


-you should see something like this in the output:
uwsgi socket 0 bound to UNIX address /tmp/progress_recorder.sock fd 3
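
If, like me, you are hazy on Unix sockets: a Unix socket is an endpoint that lives at a path on disk, which two processes on the same machine use to talk to each other with no TCP port involved. Here is a toy sketch in plain Python, where a "server" plays uwsgi's part and a "client" plays nginx's. The socket path is a throwaway temp file, not the real /tmp/progress_recorder.sock, and the bytes sent are not the real uwsgi protocol:

```python
import os
import socket
import tempfile

sock_path = os.path.join(tempfile.mkdtemp(), "demo.sock")

# "uwsgi" side: bind to a path on disk and wait for connections.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)

# "nginx" side: connect to the same path and send some bytes.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(sock_path)
client.sendall(b"hello from the web server")

# Back on the "uwsgi" side: accept the connection and read the bytes.
conn, _ = server.accept()
received = conn.recv(1024)
print(received)

# Tidy up the sockets, the socket file and its temp directory.
for s in (conn, client, server):
    s.close()
os.remove(sock_path)
os.rmdir(os.path.dirname(sock_path))
```

Because the socket is a file on disk, ordinary file permissions apply to it, which is why this step starts uwsgi as the nginx user.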

8.e) Reload nginx with the new configurations
sudo service nginx reload

8.f) Check that nginx and uwsgi are playing nicely
-In your browser, go to:
http://localhost/progress_recorder
-you should see the Django template view displayed via both nginx and uWSGI.

Congratulations!

Friday 28 September 2012

Quick Sphinx documentation for Python

Intro

Hi everyone, it's been far too long since my last post. I've been busy doing lots of things, including finding a new place to live and packing. I've also been listening to a lot of music while programming and tinkering. (If you're interested: rock music by Rush, industrial electronica by Skinny Puppy, and folk by a German band named Faun. All three of these bands are fantastic for coding!)

The Riddle of the Sphinx

Sphinx is a great documentation creation module for Python, which uses reStructuredText files to generate docs in whatever format you like (usually HTML). reStructuredText (.rst) files look weird, but they're quite easy to write once you learn how. Sphinx uses the DocUtils module, which has a handy program "rst2html.py" that can convert a .rst file straight to an HTML file. Sphinx also has an autodoc/automodule feature, which can read the docstrings in your source code as reStructuredText and turn them into attractive HTML docs too. And my favourite feature, the "sphinx-apidoc" script, can walk through all your project's subdirectories and source files, and suck the docstrings out of every single Python script it finds, automatically.

So here are a few tips about getting going quickly with Sphinx documentation for Python. I need to install Sphinx first:

$ pip install sphinx

I have a little project named "sphinxy" which contains a structure like this:

$ tree sphinxy
sphinxy
├── docs
└── sphinxy
    ├── big.py
    └── package1
        ├── __init__.py
        ├── small1.py
        └── small2.py


The file "big.py" looks like this (and the other .py files are similar):

#!/usr/bin/env python
#-*- coding: utf-8 -*-


"""
    big
    ~~~
    This module contains functions that are big.
   
    :copyright: (c) 2012 by Scott Davies for Web Drive Ltd.
"""


import package1.small1 as small1
from package1.small2 import Smallish, small2_cool, small2_rebel


def big_func1(abra, cadabra, sesame=None):
    """This is about big_func1.
   
    :param abra: abra number.
    :param cadabra: cadabra number.
    :param sesame: sesame information.
    :returns: result information.
    :rtype: int
    """
    ret_val = abra + cadabra
    print(sesame["street"])
    return ret_val



if __name__ == "__main__":
    ret_val = big_func1(4, 5, sesame={"street": "adding up"})

...

Notice some strange "~" and ":" characters in the docstrings? That's reStructuredText. Now what I want to do is just get Sphinx to read all the .py files in my project, and spin together some HTML documentation for me. (Since I can't be bothered typing custom .rst files and setting "automodule" everywhere.) So I run the "sphinx-apidoc" script, and want to put the new documentation generated in the "docs" directory.
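
To see what Sphinx has to work with, here is a rough sketch that pulls the docstring out of a function and lists its ":param" fields. It uses only the inspect and re modules. The regular expression is a simplification for illustration, and is not how Sphinx really parses reStructuredText:

```python
import inspect
import re


def big_func1(abra, cadabra, sesame=None):
    """This is about big_func1.

    :param abra: abra number.
    :param cadabra: cadabra number.
    :param sesame: sesame information.
    :returns: result information.
    :rtype: int
    """
    return abra + cadabra


# inspect.getdoc() returns the docstring with its indentation cleaned up.
doc = inspect.getdoc(big_func1)

# Collect (name, description) pairs from the ":param name: text" lines.
params = re.findall(r"^:param (\w+): (.+)$", doc, flags=re.MULTILINE)
for name, description in params:
    print(name, "->", description)
```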

$ cd ~/code/py/sphinxy
$ sphinx-apidoc -F -o docs sphinxy
Creating file docs/big.rst.
Creating file docs/package1.rst.
Creating file docs/conf.py.
Creating file docs/index.rst.
Creating file docs/Makefile.
Creating file docs/make.bat.


If I go into my "docs" directory now, I can see a bunch of .rst files (and other things to help them) in there.

$ cd docs; ls
big.rst  conf.py    make.bat  package1.rst  _templates
_build   index.rst  Makefile  _static


The index page, i.e. the root/start page of my new documentation, is called "index.rst". If I have a look at it, the important parts are like this (a table of contents tree):

...
Welcome to sphinxy's documentation!
===================================

Contents:

.. toctree::
   :maxdepth: 4

   big
   package1

...

That's what I wanted! Yahoo. What I need to do now is convert those .rst files into my desired format, HTML. For that, I'm going to have to edit a setting in the "conf.py" file, so that Sphinx can find the directory I want it to start running from. I'll uncomment the line that adds to the sys.path for Sphinx autodoc to work:

$ vim conf.py

I change this line, and save:

sys.path.insert(0, os.path.abspath('../../sphinxy'))
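
The reason this line matters: autodoc imports your modules in order to read their docstrings, so Python must be able to find them on sys.path. Here is a self-contained sketch of the same idea, using a throwaway directory and a made-up module name (demo_big) rather than the real sphinxy project:

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway "project" directory holding one documented module.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "demo_big.py"), "w") as handle:
    handle.write('"""This module contains functions that are big."""\n')

# Same move as in conf.py: put the directory at the front of sys.path.
sys.path.insert(0, os.path.abspath(project_dir))

# Now the import succeeds and the module docstring is available.
demo_big = importlib.import_module("demo_big")
module_doc = demo_big.__doc__
print(module_doc)
```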

Then I create my HTML docs!

$ make html
sphinx-build -b html -d _build/doctrees   . _build/html
Making output directory...
Running Sphinx v1.1.3
loading pickled environment... not yet created
building [html]: targets for 3 source files that are out of date
updating environment: 3 added, 0 changed, 0 removed
reading sources... [100%] package1                                                     
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] package1                                                      
writing additional files... (3 module code pages) _modules/index
 genindex py-modindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.


Build finished. The HTML pages are in _build/html.



I will find my lovely new pages in the _build/html directory:

$ cd _build/html; ls
big.html       index.html  objects.inv    py-modindex.html  searchindex.js  _static
genindex.html  _modules    package1.html  search.html       _sources


Here is the index.html file viewed in a browser:


Here is the file big.html in a browser:




Make sure you read the reStructuredText tutorials available online, to help you in future. Here's a quick example of editing a .rst file manually, which you'll understand if you've read a few tutorials. This is part of a file I wrote to document a web services API, for calling a URL "init":

:init:

Initialises a workflow by attempting to save a non-template workflow to the database.
It will retrieve workflow template information for the workflow name and populate the new workflow with the information. It will also populate these workflow fields with initial default values: current_step_task, template_id, status, state.

**Parameters:**   
    * **name** - the name of a workflow as a string.
   
    i.e. send a JSON request in this format:

    ::

        {
            "name": unicode(50)
        }
   
**Returns:**
    Inside the JSON response dict:
        * **errors** - a list of any errors (strings) that may have occurred.
        * **confirmations** - a list of any confirmations that may have occurred.
        * **workflow_id** - the ObjectId string of the new workflow.
    e.g.
   
    ::
   
        {
            "errors": [],
            "confirmations": ["Successfully initialised workflow. "],
            "workflow_id": "4ea5e2afcd8bbf1491000005"
        }
       
**Usage example:**

::

    import json
    import urllib2

    base_url = "http://asapweb3.local.org.nz:9003/approvals/"
    request_dic = {"name": "Purchase Order Workflow"}
    req = urllib2.Request(url=base_url + "init",
                          data=json.dumps(request_dic),
                          headers={"Content-Type": "application/json"}
                        )
    opener = urllib2.build_opener()
    resp = opener.open(req)
    json_resp_strg = resp.read()
    # Get the info from the json string
    json_resp = json.loads(json_resp_strg)  
     



And here is the HTML page result via Sphinx (yay!):


Till next time, fellow Sphinxians. :)

Saturday 7 July 2012

Pymongo and Datetimes for MongoDB

You might have read in an earlier post how I recommended storing datetime values in a database with a UTC timezone set. Well, I thought I'd go over doing this with a real database. MongoDB is quite a trendy NoSQL database at the moment, and one Python module you can use for accessing a MongoDB database is PyMongo.

Installing


Follow the steps on the MongoDB website for installing it in your OS (I'm only going through this for Linux, particularly Ubuntu). Once you've updated and altered your /etc/apt/sources.list, install it!
sudo apt-get install mongodb

Then check it's running. Hopefully, you should see an OK message.
sudo service mongodb status

Activate your virtualenv for your Python version, then install Pymongo.
source /pathtomyenv/bin/activate
pip install pymongo

I just want to sort out timezones!


OK, so here's a bit of background. Pymongo makes some helpful assumptions for you when dealing with timezone information for datetimes. It assumes you want to store datetimes in a MongoDB database with UTC timezone value, and will convert things automatically for you (although there are some options you can set). Here's a little diagram that hopefully illustrates the main ideas:



So let's see what happens when we create a datetime object with a local timezone set to it, and send it to MongoDB. I'm making a database named "school_info", with a collection inside it named "students". I want to insert a document for a student with a datetime object as a field value (for "birth_date"). I'm going to put a complete Python source listing at the end for you, but here is the relevant part first:

tzone = {"GMT": Zone(0, False, "GMT"),
            "NZST": Zone(12, False, "NZST"),
            "NZDT": Zone(12, True, "NZDT")
            }
student = {"first_name": "Jane",
    "birth_date": datetime(2003, 3, 17, tzinfo=tzone["NZDT"]),
}
...
# Open connection to MongoDB
conn = Connection()
# Database is named "school_info".
db = conn.school_info
# Insert into the "students" collection
db.students.insert(student)

This snippet will add the student record to the students collection, with the datetime value adjusted from local "NZDT" timezone to a "UTC" timezone value. When we retrieve the student document again from MongoDB, we should be able to see this:
student_check = db.students.find_one({"first_name": "Jane"})
print(student_check["birth_date"].strftime("%d/%m/%Y %H:%M:%S, timezone: %Z"))

But - the output is this!
16/03/2003 11:00:00, timezone:

-Note how there is no timezone assigned! It's a naive representation of a datetime WITHOUT its UTC timezone. The datetime value has gone back by the UTC offset value for the local timezone, which was +12+1 hours for NZDT. What makes this worse is that we can't convert this naive datetime (i.e. having no timezone assigned to it) to a local timezone-aware datetime.
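
The arithmetic can be reproduced without MongoDB at all. Here is a minimal sketch, assuming Python 3's built-in datetime.timezone in place of the Zone class, that converts the NZDT value to UTC and then throws the timezone away, which is in effect what comes back from the database:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for the NZDT zone: UTC+12 plus 1 hour of DST.
NZDT = timezone(timedelta(hours=13), "NZDT")

local_birth_date = datetime(2003, 3, 17, tzinfo=NZDT)

# Step 1: shift to UTC (13 hours earlier).
utc_birth_date = local_birth_date.astimezone(timezone.utc)

# Step 2: drop the tzinfo, leaving a naive datetime.
naive_birth_date = utc_birth_date.replace(tzinfo=None)

print(naive_birth_date.strftime("%d/%m/%Y %H:%M:%S, timezone: %Z"))
```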

The solution: try giving the retrieved datetime object a UTC timezone first, then try converting it to a local timezone value.
utc_b_date = student_check["birth_date"].replace(tzinfo=tzone["GMT"])

Then it SHOULD convert back to the local timezone:
local_b_date = utc_b_date.astimezone(tzone["NZDT"])
print(local_b_date.strftime("%d/%m/%Y %H:%M:%S, timezone: %Z"))

The output:
17/03/2003 00:00:00, timezone: NZDT

Here is the full code for this datetime/timezone storage example, as promised. (Scroll to the end of this example, and you'll see a further example which is a unittest to prove this is the way things happen.)

#!/usr/bin/env python


"""A script to show default MongoDB storage of datetime and timezone values if you don't calculate and explicitly set UTC values on datetimes before storing them. """


from datetime import datetime, tzinfo, timedelta

from pymongo import Connection


class Zone(tzinfo):
    """ Sets some properties for the tzinfo abstract class. """
    def __init__(self, offset, isdst, name):
        self.offset = offset
        self.isdst = isdst
        self.name = name
       
       
    def utcoffset(self, dt):
        return timedelta(hours=self.offset) + self.dst(dt)
       
       
    def dst(self, dt):
        if self.isdst:
            dst = timedelta(hours=1)
        else:
            dst = timedelta(0)
        return dst
           
           
    def tzname(self, dt):
         return self.name


if __name__ == "__main__":
    tzone = {"GMT": Zone(0, False, "GMT"),
            "NZST": Zone(12, False, "NZST"),
            "NZDT": Zone(12, True, "NZDT")
            }
    display_patn = "%d/%m/%Y %H:%M:%S, timezone: %Z"

    student = {"first_name": "Jane",
        "birth_date": datetime(2003, 3, 17, tzinfo=tzone["NZDT"]),
    }
    print("The student will be added with this birth_date:")
    print(student["birth_date"].strftime(display_patn))
    msg = """-Note: this will be stored by Pymongo in MongoDB automatically with
a UTC timezone\n"""
    print(msg)
   
    # Open connection to MongoDB
    conn = Connection()
    # Database is named "school_info". Clear the students collection first.
    db = conn.school_info
    db.students.drop()
    # Insert into the "students" collection
    db.students.insert(student)

    student_check = db.students.find_one({"first_name": "Jane"})
    print("When the student is retrieved from MongoDB, the birth_date is:")
    print(student_check["birth_date"].strftime(display_patn))
    msg = """-Note how there is no timezone assigned! It's a naive representation of a datetime WITHOUT its UTC timezone.\n"""
    print(msg)
   
    print("Try converting the retrieved value to the local timezone:")
    try:
        local_b_date = student_check["birth_date"].astimezone(
            tzone["NZDT"])
    except ValueError as e:
        msg = """--> Error: {0}.\n-This causes an error because a naive datetime can't be converted to a timezoned datetime.\n""".format(str(e))
        print(msg)

    print("Now, try giving the retrieved value a UTC timezone first.")
    utc_b_date = student_check["birth_date"].replace(
        tzinfo=tzone["GMT"])
    print("Then it SHOULD convert back to local the timezone.")
    local_b_date = utc_b_date.astimezone(tzone["NZDT"])
    print(local_b_date.strftime(display_patn))

The output:
The student will be added with this birth_date:
17/03/2003 00:00:00, timezone: NZDT
-Note: this will be stored by Pymongo in MongoDB automatically with a UTC timezone

When the student is retrieved from MongoDB, the birth_date is:
16/03/2003 11:00:00, timezone:
-Note how there is no timezone assigned! It's a naive representation of a datetime WITHOUT its UTC timezone.

Try converting the retrieved value to the local timezone:
--> Error: astimezone() cannot be applied to a naive datetime.
-This causes an error because a naive datetime can't be converted to a timezoned datetime.

Now, try giving the retrieved value a UTC timezone first.
Then it SHOULD convert back to local the timezone.
17/03/2003 00:00:00, timezone: NZDT

Let's Get Testing


If we take a local-timezoned datetime, and convert it to a UTC-timezoned datetime, that UTC datetime value should match what MongoDB stores and returns to us later, right? Right! So let's prove it with a unit test case. We can pull the information back out of MongoDB, re-attach the UTC timezone to the retrieved datetime, convert it to the local timezone, and check that both values match the ones we started with.


#!/usr/bin/env python


"""A simple script to test datetime timezone storage, retrieval and conversion for Pymongo."""


import unittest
from datetime import datetime, tzinfo, timedelta

from pymongo import Connection


class Zone(tzinfo):
    """ Sets some properties for the tzinfo abstract class. """
    def __init__(self, offset, isdst, name):
        self.offset = offset
        self.isdst = isdst
        self.name = name
       
       
    def utcoffset(self, dt):
        return timedelta(hours=self.offset) + self.dst(dt)
       
       
    def dst(self, dt):
        if self.isdst:
            dst = timedelta(hours=1)
        else:
            dst = timedelta(0)
        return dst
           
           
    def tzname(self, dt):
         return self.name


class Tester(unittest.TestCase):


    def setUp(self):
        # Open connection to MongoDB
        conn = Connection()
        # Database is named "school_info"
        self.db = conn.school_info
       
       
    def tearDown(self):
        # Clear test data
        self.db.students.drop()
       
   
    def test_local_and_utc(self):
        """Test case! """
        tzone = {"GMT": Zone(0, False, "GMT"),
                "NZST": Zone(12, False, "NZST"),
                "NZDT": Zone(12, True, "NZDT")
                }
        display_patn = "%d/%m/%Y %H:%M:%S, timezone: %Z"

        # Local client timezone datetime
        local_b_date = datetime(2003, 3, 17, tzinfo=tzone["NZDT"])
        # Convert to a UTC timezone datetime for MongoDB storage. Because it
        # will be UTC already, it won't be converted to a different calculated
        # datetime by Pymongo.
        utc_b_date = local_b_date.astimezone(tzone["GMT"])
        student = {"first_name": "Jane",
                    "birth_date": utc_b_date,
        }
        # Insert into the "students" collection
        self.db.students.insert(student)
       
        # Retrieve student again for checking
        student_check = self.db.students.find_one({"first_name": "Jane"})
        # Set properly as UTC timezone
        utc_b_date_check = student_check["birth_date"].replace(
            tzinfo=tzone["GMT"])
        # Then convert the retrieved value to the local timezone
        local_b_date_check = utc_b_date_check.astimezone(tzone["NZDT"])
       
        # Check that retrieved timezone-aware data matches earlier data before storage
        self.assertEqual(local_b_date, local_b_date_check)
        self.assertEqual(utc_b_date, utc_b_date_check)
   
   
if __name__ == "__main__":
    unittest.main()

And the output:
.
----------------------------------------------------------------------
Ran 1 test in 0.004s

OK
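
If you don't have a MongoDB server handy, the same round trip can be approximated with a fake store. This is only a sketch: fake_store_and_fetch() is a made-up function that mimics the behaviour described above (convert to UTC, hand back a naive datetime), and it uses Python 3's datetime.timezone rather than the Zone class:

```python
import unittest
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
NZDT = timezone(timedelta(hours=13), "NZDT")


def fake_store_and_fetch(value):
    """Mimic the round trip: store as UTC, hand back a naive datetime."""
    return value.astimezone(UTC).replace(tzinfo=None)


class RoundTripTester(unittest.TestCase):

    def test_local_and_utc(self):
        local_b_date = datetime(2003, 3, 17, tzinfo=NZDT)
        utc_b_date = local_b_date.astimezone(UTC)

        fetched = fake_store_and_fetch(local_b_date)
        # Re-attach UTC, then convert back to the local timezone.
        utc_b_date_check = fetched.replace(tzinfo=UTC)
        local_b_date_check = utc_b_date_check.astimezone(NZDT)

        self.assertEqual(utc_b_date, utc_b_date_check)
        self.assertEqual(local_b_date, local_b_date_check)


# Run the test case directly, without unittest.main() exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RoundTripTester)
result = unittest.TextTestRunner().run(suite)
```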

Thursday 28 June 2012

Python, datetimes and timezones

Datetimes, UTC and Timezones


This week I was fortunate enough to be able to explore the issue of datetimes and making them timezone-aware in Python. I didn't want to use a ready-made module like pytz or dateutil to do this. My discoveries ended up not being used in production for my work, but they were very useful nonetheless!

The conventional wisdom is that datetime information for applications should be stored according to the Coordinated Universal Time (UTC) standard. This is especially important for a website with international users, and the storage of its datetime information in a database. When the data is first captured, say by a form being filled out, the data is probably in the client's local time. Then it gets stored in UTC format internally by the application. When later it needs to be retrieved, it is converted back to the client's local time. For example:
  • a user's client sends datetime data from the "NZDT" timezone
  • the app stores a calculated datetime in the database in the "UTC" timezone 
  • when the user wants the data back, it is converted back again to a datetime for the local "NZDT" timezone. 
In this way, there will always be a consistent timezone for the storage of data, and a relevant timezone for retrieved data to be displayed in.

Timezones are an interesting thing, and they are based around the idea of offsets from the UTC (aka GMT) datetime value, which is the datetime always used as a base. As an example, the normal timezone for New Zealand is "NZST", which has an offset of +12 hours. You can find a list of timezone codes here. Timezones are also further complicated by the idea of Daylight Savings time, which may add or subtract additional hours to the UTC offset. Once Daylight Savings kicks in in my country, the timezone becomes "NZDT" and the offset becomes +12+1 hours.
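
The offset arithmetic is simple enough to do by hand: local time minus the offset gives UTC. Here is a tiny sketch with plain timedelta objects (the OFFSETS dictionary and to_utc() helper are made up for this illustration):

```python
from datetime import datetime, timedelta

# How far ahead of UTC each timezone code is.
OFFSETS = {"NZST": timedelta(hours=12),
           "NZDT": timedelta(hours=12 + 1)}


def to_utc(local_naive, zone_code):
    """Subtract the zone's offset from a naive local datetime."""
    return local_naive - OFFSETS[zone_code]


midnight = datetime(2012, 6, 28, 0, 0, 0)
print(to_utc(midnight, "NZST"))  # standard time: UTC is 12 hours behind
print(to_utc(midnight, "NZDT"))  # Daylight Savings: UTC is 13 hours behind
```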

Apply this to Python


Python has an abstract class named tzinfo, which you are meant to implement yourself for setting timezones on datetime.datetime objects. After reading the Python docs for tzinfo online, I was a bit puzzled about how exactly to do this. Luckily, I found a great answer on StackOverflow (as usual) and proceeded from there. Here it is:
class Zone(tzinfo):
    """ Sets some properties for the tzinfo abstract class. """
    def __init__(self, offset, isdst, name):
        self.offset = offset
        self.isdst = isdst
        self.name = name
       
       
    def utcoffset(self, dt):
        return timedelta(hours=self.offset) + self.dst(dt)
       
       
    def dst(self, dt):
        if self.isdst:
            dst = timedelta(hours=1)
        else:
            dst = timedelta(0)
        return dst
           
           
    def tzname(self, dt):
         return self.name

I found the Daylight Savings start and end dates for my country, New Zealand, that I would need as well for calculations. These rules stated that:
"Daylight Saving commences on the last Sunday in September, when 2.00am becomes 3.00am, and it ends on the first Sunday in April, when 3.00am becomes 2.00am."

So, here is my implementation of the timezones I needed in a dictionary. This is for NZ time without daylight savings (NZST), NZ time including daylight savings (NZDT), and UTC time (stored under the key "GMT").

tzone = {"GMT": Zone(0, False, "GMT"),
            "NZST": Zone(12, False, "NZST"),
            "NZDT": Zone(12, True, "NZDT")
            }

So, say you had an ordinary datetime value like:
dt = datetime.datetime(2012, 6, 28, 0, 0, 0)
This is a timezone-naive datetime object, since there is nothing set for the optional parameter at the end, "tzinfo". To make it timezone-aware for my local timezone (NZST, since 28 June falls outside the Daylight Savings period), you'd do as follows:
dt = dt.replace(tzinfo=tzone["NZST"])
Then to get this back as UTC datetime:
utc_dt = dt.astimezone(tz=tzone["GMT"])

To verify this (remember the local offset is +12 hours):
print(dt.strftime("%d/%m/%Y %H:%M:%S %Z"))
28/06/2012 00:00:00 NZST
print(utc_dt.strftime("%d/%m/%Y %H:%M:%S %Z"))
27/06/2012 12:00:00 GMT

Next, I wrote two functions that would give me the datetime values for the start and end of the daylight savings period for any given year, according to those pesky rules above. Then I thought: "How the heck am I going to test that my program can set a datetime with a correct timezone based on a correct daylight savings setting?" A divine inspiration hit me, and I decided to loop through all the days in the year 2012, and for each day: do just that!
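
For the curious, the two Daylight Savings rules can be worked out with a little date arithmetic. Here is a short sketch, independent of the functions in the full script below, with two made-up helper names. It relies on date.weekday() numbering Monday as 0 and Sunday as 6:

```python
import calendar
from datetime import date, timedelta


def last_sunday(year, month):
    """Return the date of the last Sunday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back from the month's final day to the nearest Sunday.
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def first_sunday(year, month):
    """Return the date of the first Sunday in the given month."""
    first_day = date(year, month, 1)
    # Step forward from the 1st to the nearest Sunday.
    return first_day + timedelta(days=(6 - first_day.weekday()) % 7)


dst_start_day = last_sunday(2012, 9)   # Daylight Savings begins
dst_end_day = first_sunday(2012, 4)    # Daylight Savings ends
print(dst_start_day, dst_end_day)
```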

For each day between 1/1/2012 till 31/12/2012 (inclusive), my script would calculate two datetime objects: one for the datetime at midnight (00:00:00) and one for after 3am, when Daylight Savings might have kicked in. And it did just that.

So, I added some more calculations to turn each of those local datetimes into datetimes with the timezone set as UTC.  This was so I could check that UTC conversion worked too. Below is the script in full, gentle reader.

#!/usr/bin/env python
#-*- coding: utf-8 -*-


from datetime import datetime, tzinfo, timedelta
from calendar import monthrange as cal_monthrange


class Zone(tzinfo):
    """ Sets some properties for the tzinfo abstract class. """
    def __init__(self, offset, isdst, name):
        self.offset = offset
        self.isdst = isdst
        self.name = name
       
       
    def utcoffset(self, dt):
        return timedelta(hours=self.offset) + self.dst(dt)
       
       
    def dst(self, dt):
        if self.isdst:
            dst = timedelta(hours=1)
        else:
            dst = timedelta(0)
        return dst
           
           
    def tzname(self, dt):
         return self.name
        
        
def get_start_month_get_last_sunday(year, month, NZDT, get_naive=True):
    """ Returns a datetime.datetime object for the start of the Daylight Savings period for NZ. """
    days_in_month = cal_monthrange(year, month)[1]
    days = [datetime(year, month, day) for day in range(20, days_in_month + 1)]
    days = [day for day in days if day.weekday() == 6]  # keep Sundays only (Monday is 0, Sunday is 6)
    last_sunday = days[-1]
    if get_naive:
        ret_val = datetime(year, month, last_sunday.day, 2, 0, 0)
    else:
        ret_val = datetime(year, month, last_sunday.day, 2, 0, 0, tzinfo=NZDT)
    return ret_val


def get_end_month_first_sunday(year, month, NZST, get_naive=True):
    """ Returns a datetime.datetime object for the end of the Daylight Savings period for NZ. """
    days = [datetime(year, month, d) for d in range(1, 8)]
    days = [day for day in days if day.weekday() == 6]  # keep Sundays only (Monday is 0, Sunday is 6)
    first_sunday = days[0]
    if get_naive:
        ret_val = datetime(year, month, first_sunday.day, 3, 0, 0)
    else:
        ret_val = datetime(year, month, first_sunday.day, 3, 0, 0, tzinfo=NZST)
    return ret_val
   

def is_dst(dt, tzone, dst_start, dst_end):
    """ Returns True if the datetime is within the Daylight Savings period. """
    if dt > dst_start:
        dst = True  # In DST zone for next year
    elif dt > dst_end:
        dst = False # Not in DST zone for current year
    else:
        dst = True  # In DST zone started in previous year
    return dst
   

def get_local_zoned_dt(dt, tzone, dst_start, dst_end):
    """ Returns the datetime.datetime object with its tzinfo assigned, based on the daylight savings period. """
    return (dt.replace(tzinfo=tzone["NZDT"])
            if is_dst(dt, tzone, dst_start, dst_end)
            else dt.replace(tzinfo=tzone["NZST"]))
           
           
def create_csv_for_year(start, end, dst_start, dst_end, tzone):
    """ Loop through the datetimes in a datetime range. Write data to a CSV file for inspection. """
    display_patn = "%d/%m/%Y %H:%M:%S %Z"
    one_day = timedelta(days=1)
    after_3am = timedelta(hours=3, minutes=1)
    logf = open("data/log.csv", "w")
    hdrs = "Local DT,UTC DT,Later DT,Later UTC DT"
    logf.write(hdrs + "\n")
    ind = start
    while ind <= end:
        # Get datetime values for local and UTC time.
        ind = get_local_zoned_dt(ind, tzone, dst_start, dst_end)
        ind_utc = ind.astimezone(tz=tzone["GMT"])
        # Get datetime values for local and UTC time, 3 hours later.
        later_dt = ind + after_3am
        later_dt = get_local_zoned_dt(later_dt, tzone, dst_start, dst_end)
        later_dt_utc = later_dt.astimezone(tz=tzone["GMT"])
       
        row = [ind.strftime(display_patn),
                ind_utc.strftime(display_patn),
                later_dt.strftime(display_patn),
                later_dt_utc.strftime(display_patn),
                ]
        logf.write(",".join(row) + "\n")
        ind += one_day
    logf.close()


if __name__ == "__main__":
    tzone = {"GMT": Zone(0, False, "GMT"),
            "NZST": Zone(12, False, "NZST"),
            "NZDT": Zone(12, True, "NZDT")
            }
    # Set timezone naive query start and end parameters.
    start = datetime(2012, 1, 1, 0, 0, 0)
    end = datetime(2012, 12, 31, 23, 59, 59)   
    # Get timezone naive datetime values for the start and end of the Daylight
    # Savings Period.
    dst_start_naive = get_start_month_get_last_sunday(start.year, 9, None, get_naive=True)
    dst_end_naive = get_end_month_first_sunday(start.year, 4, None, get_naive=True)
   
    # Get local datetime values for query start & end.
    start = get_local_zoned_dt(start, tzone, dst_start_naive, dst_end_naive)
    end = get_local_zoned_dt(end, tzone, dst_start_naive, dst_end_naive)
    # Get local datetime values for Daylight Savings start & end.
    dst_start = get_local_zoned_dt(dst_start_naive, tzone, dst_start_naive, dst_end_naive)
    dst_end = get_local_zoned_dt(dst_end_naive, tzone, dst_start_naive, dst_end_naive)
    # Write data for datetimes in the query range to a CSV file.
    create_csv_for_year(start, end, dst_start, dst_end, tzone)
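
The rule applied by is_dst() above is easier to trust once it has been run against a few known dates. Here is a minimal, self-contained sketch of the same comparison, written for Python 3 and using naive datetimes only. The 2012 boundary constants (including the 2 am start time) are my own assumptions for illustration, and the function signature is simplified from the one in the script.

```python
from datetime import datetime


def is_dst(dt, dst_start, dst_end):
    """Southern-hemisphere rule: DST runs from dst_start (spring), over the
    new year, until dst_end (autumn). Both boundaries are for dt's own year."""
    if dt > dst_start:
        return True   # After the spring changeover: DST for the coming summer
    if dt > dst_end:
        return False  # Between the autumn and spring changeovers: standard time
    return True       # Before the autumn changeover: DST from the previous year


# Assumed 2012 boundaries for New Zealand, as naive local times.
DST_END_2012 = datetime(2012, 4, 1, 3, 0, 0)     # first Sunday of April
DST_START_2012 = datetime(2012, 9, 30, 2, 0, 0)  # last Sunday of September

for sample in (datetime(2012, 1, 15), datetime(2012, 6, 15), datetime(2012, 11, 15)):
    print(sample.date(), is_dst(sample, DST_START_2012, DST_END_2012))
```

The three sample dates fall in summer, winter and the following summer, so the second one should be the only one reported as outside DST.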

Sunday 17 June 2012

Mercurial, Apache and mod_wsgi


In a previous post, you might have read that I was given a mission to set up a Mercurial code repository for users within a private network, to be accessed via passwords. In practice, this meant using the Apache web server to serve up a Mercurial repository at a live URL, with basic HTTP authentication via passwords.

Normally I access things stored in Mercurial over ssh, not HTTP, so this was something I hadn't tried in a long time. My memory doesn't stretch back more than six months! So I wrote down what I had to do to make Mercurial, Apache, mod_wsgi and Python play nicely together.

I thought I would share the things I did to get things going. I'm assuming that you already know something about:
  • Apache HTTP server
  • Linux, the command line and apt for packages
  • Python and WSGI
  • Mercurial version control system (cloning)

If you're interested, the server was running slightly older versions of things: Ubuntu 10.04 and Python 2.6x. That's why I went with the slightly old mod_wsgi for Apache. Even though I did this on Ubuntu, I dare say these instructions would be almost identical for someone using Linux Mint, Crunchbang or Debian.

Here are the steps to follow, for a project named "booklister":

On the server


Make sure you have copied your project's source files over to the server first,
e.g. by zipping them and using scp

Install Apache and Mercurial
apt-get install apache2 mercurial

Install the mod_wsgi interface for Python
apt-get install libapache2-mod-wsgi

Restart Apache
/etc/init.d/apache2 restart

Install the Python module for Mercurial
pip install mercurial

Create an "hgusers" group for all users who will be working with the repository
groupadd hgusers

Add users into the group, using this format:
usermod -a -G GROUPNAME USERNAME

usermod -a -G hgusers scott
usermod -a -G hgusers anotheruser

You can view the users in the new group:
cat /etc/group
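
Scanning the whole of /etc/group by eye gets tedious on a server with many groups. As a rough sketch (Python 3, my own helper rather than anything from Mercurial or Apache), the member list for one group can be pulled out of text in the /etc/group format like this. The sample text and the gid 1001 are made up for illustration.

```python
def group_members(group_text, group_name):
    """Return the members listed for group_name in /etc/group-format text,
    or None if the group does not appear at all."""
    for line in group_text.splitlines():
        # Each line looks like: name:password:gid:member1,member2
        fields = line.strip().split(":")
        if len(fields) == 4 and fields[0] == group_name:
            return [member for member in fields[3].split(",") if member]
    return None


# Hypothetical sample; on the server you would read the real file instead,
# e.g. with open("/etc/group") as f: text = f.read()
sample = "root:x:0:\nhgusers:x:1001:scott,anotheruser\n"
print(group_members(sample, "hgusers"))  # ['scott', 'anotheruser']
```

On a real system, the standard library's grp.getgrnam("hgusers").gr_mem gives the same list without any parsing.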

Set up password access for Mercurial via Apache: create user passwords for Apache authentication using the htpasswd utility

htpasswd -cb /etc/apache2/hgpasswd scott scottspassword
htpasswd -b /etc/apache2/hgpasswd anotheruser anotherpassword

Create a directory which will hold the Mercurial repositories.
Set permissions on it for the Apache user and the hgusers group
cd /var/lib
mkdir hg_repos
chown -R www-data:hgusers hg_repos
chmod -R g+rwx hg_repos
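
If pushes fail later with permission errors, the group bits on the repository directory are the first thing to check. The following is only a sketch (Python 3, POSIX systems assumed): group_has_rwx is a hypothetical helper of mine, and it is demonstrated on a temporary directory rather than on /var/lib/hg_repos.

```python
import os
import stat
import tempfile


def group_has_rwx(path):
    """Return True if the group permission bits on path include r, w and x."""
    mode = os.stat(path).st_mode
    return (mode & stat.S_IRWXG) == stat.S_IRWXG


# Demonstrate on a throwaway directory, not the real repository path.
demo_dir = tempfile.mkdtemp()
try:
    os.chmod(demo_dir, 0o750)       # group has r-x only
    print(group_has_rwx(demo_dir))  # False
    current = stat.S_IMODE(os.stat(demo_dir).st_mode)
    os.chmod(demo_dir, current | stat.S_IRWXG)  # same effect as chmod g+rwx
    print(group_has_rwx(demo_dir))  # True
finally:
    os.rmdir(demo_dir)
```

On the server you would call group_has_rwx("/var/lib/hg_repos") after running the chmod command above.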

Initialise a Mercurial repository, and copy source project files into it. Add them all into version control.

cd hg_repos
hg init booklister
cd booklister
cp -r /home/scott/temp/booklister/* .
hg add

Create an hgrc file (for Mercurial) in the repository's .hg directory
vim .hg/hgrc

[ui]
username = "scott@mycompany.com"

[trusted]
groups = hgusers
users = scott, anotheruser

[web]
allow_read = scott, anotheruser
allow_push = scott, anotheruser
allow_archive = gz, zip, bz2
push_ssl=False
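
A typo in the [web] section is an easy way to lock yourself out of pushing, so it can be worth reading the file back before going further. This is a rough sketch in Python 3 using the standard configparser module. It assumes the hgrc is simple enough to parse as an ordinary INI file (Mercurial's own parser is more lenient, and also accepts whitespace-separated lists), and users_allowed_to is a hypothetical helper name.

```python
import configparser

# Same content as the hgrc above, inlined so the sketch is self-contained.
HGRC_TEXT = """\
[ui]
username = "scott@mycompany.com"

[trusted]
groups = hgusers
users = scott, anotheruser

[web]
allow_read = scott, anotheruser
allow_push = scott, anotheruser
allow_archive = gz, zip, bz2
push_ssl=False
"""


def users_allowed_to(hgrc_text, action):
    """Return the comma-separated names from [web] allow_<action>, as a list."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    raw = parser.get("web", "allow_" + action, fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


print(users_allowed_to(HGRC_TEXT, "push"))  # ['scott', 'anotheruser']
```

An empty list means that nobody has been granted that permission in the file.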

Create an hgweb.wsgi file (for mod_wsgi) in the repository directory
vim hgweb.wsgi

#!/usr/bin/env python


""" File for Mercurial, Apache and mod_wsgi http access.
Enable demandloading to reduce startup time.
"""


# Enable demand-loading first, so that the hgweb import below is loaded lazily.
from mercurial import demandimport
demandimport.enable()

from mercurial.hgweb import hgweb


# Path to repo or hgweb config to serve (see 'hg help hgweb')
config = "/var/lib/hg_repos/booklister"
application = hgweb(config)
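
It may help to see what mod_wsgi expects from this file: by default it looks for a module-level callable named application that follows the WSGI calling convention. The sketch below (Python 3) uses a stand-in application of my own, not Mercurial's hgweb, and a hypothetical call_app helper that invokes it the way a WSGI server would, so it runs without Apache or Mercurial installed.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Stand-in WSGI callable: reports the requested path as plain text."""
    body = ("Requested path: %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [("Content-Type", "text/plain; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path):
    """Call a WSGI app once for the given path; return (status, body bytes)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys a WSGI server provides
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(application, "/booklister")
print(status)
print(body.decode("utf-8"))
```

In hgweb.wsgi, hgweb(config) returns an object that plays the role of application here.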

Make an initial commit for the new files, with a comment

hg commit -m "First commit"

Correct permissions on the repository files that may be new or changed

cd /var/lib/hg_repos
chown -R www-data:hgusers booklister
chmod -R g+rwx booklister

Edit the Apache config file (or a virtual host file). Add Directory and Location blocks.

vim /etc/apache2/apache2.conf

WSGIScriptAlias /booklister /var/lib/hg_repos/booklister/hgweb.wsgi
WSGIDaemonProcess booklister user=www-data group=hgusers threads=1 processes=10
<Directory /var/lib/hg_repos/booklister>
  Options ExecCGI FollowSymLinks
  AddHandler wsgi-script .wsgi
  AllowOverride None
  Order deny,allow
  deny from all
  allow from vvv.ww.xxx.y/zz
</Directory>

ScriptAlias /booklister "/var/lib/hg_repos/booklister/hgweb.wsgi"
<Location /booklister>
  AuthType Basic
  AuthUserFile /etc/apache2/hgpasswd
  AuthName "Book Lister project repo"
  Require valid-user
</Location>

Restart Apache
/etc/init.d/apache2 restart
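
A note on what "AuthType Basic" means in practice: the client sends the user name and password joined by a colon and base64-encoded in an Authorization header. That is an encoding, not encryption, which matters when push_ssl is switched off. Here is a small sketch in Python 3. Both function names are my own, and the snippet only builds and decodes the header value; it doesn't contact any server.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Authorization header for Basic auth."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth_header(value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("scott", "scottspassword")
print(header)
print(parse_basic_auth_header(header))  # ('scott', 'scottspassword')
```

Since anyone who can see the traffic can run the second function, plain HTTP with Basic auth is only reasonable on a private network like the one described here.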


Locally


Clone the remote repository in a desired location. You'll now have a local copy!

cd /home/scott/ws/py
hg clone http://myserver/booklister

Enter your user name and password when prompted:

scott
scottspassword

Create a local rc file for Mercurial. You can do this in your home ~ directory,
or in the cloned project's folder. I usually put mine in the project's folder.

cd booklister
vim .hg/hgrc

[ui]
username = Scott Davies <scott@mycompany.com>
password = scottspassword

[paths]
default = http://myserver/booklister

Try making a change to a file, then try to push the change to the repository

hg commit -m "Test change"
hg push

pushing to http://myserver/booklister
http authorization required
realm: Book Lister project repo
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files