Posts Tagged ‘Django’

Removing HTML from Python Strings

Posted in Programming on October 15th, 2011 by Simon Connah

Removing HTML from a Python string is easy with the lxml library. Something similar to the snippet below is all you need:

from lxml import html

# parse the string content into an lxml document tree and place the result in doc
doc = html.document_fromstring(content)
# get the text content of doc (with no markup) and place the result in text_doc
text_doc = doc.text_content()

text_doc now contains the text of the original string with no HTML markup in it. You can then use something like Postmarkup so that users can style the text they post on your site with BBCode or a similar markup tool, without the security risk of allowing them to post HTML.

If you are using Django, it is essential that you mark your strings as safe using the safe template filter so that Django does not automatically escape the HTML created legitimately from the BBCode.

A full example that uses Postmarkup and lxml to first remove any HTML in a post and then render the BBCode into HTML for display on a site:

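from postmarkup import render_bbcode
from lxml import html

# strip any HTML from the submitted post, leaving plain text (which may contain BBCode)
doc = html.document_fromstring(content)
bbcode = doc.text_content()
# render the BBCode into HTML for display
content = render_bbcode(bbcode)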
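
If lxml is not available, the stripping step above can be roughly approximated with the standard library. This is only a sketch and not the method used in this post: it swaps lxml for html.parser, the names TextExtractor and strip_html are made up for illustration, and its handling of malformed markup will differ from lxml's.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects only the text found between tags."""

    def __init__(self):
        # convert_charrefs defaults to True, so entities such as &amp; are decoded
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # called for every run of text between tags
        self.parts.append(data)


def strip_html(content):
    """Return the text of content with the markup removed."""
    parser = TextExtractor()
    parser.feed(content)
    parser.close()
    return "".join(parser.parts)


print(strip_html("<p>Hello <b>world</b></p>"))  # prints: Hello world
```

As with the lxml version, the result is plain text that still needs to go through something like Postmarkup (or be escaped) before it is displayed.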

Debian, Django, Nginx, FastCGI and Daemon Tools – The Perfect Combination

Posted in General on March 29th, 2011 by Simon Connah

Deploying a Django application can be quite confusing, especially if you are a developer who is not normally involved in system administration work. So what is the best configuration for hosting multiple Django sites on the same server with the lowest administrative overhead possible?

Having tried Gunicorn and Nginx (which admittedly was easier to get working for a single site) I have to say that hosting Django apps using Nginx, FastCGI and Daemon Tools is by far the best method. It avoids having to write init scripts for your Django apps entirely (a task which is still somewhat nasty, especially compared with the alternatives on non-Linux platforms such as launchd on Mac OS X, which is far superior).

Python

The only real requirements on the Python side are that you have virtualenv and flup installed. Having said that, using pip and virtualenvwrapper is also highly recommended in all Python-based deployments, no matter which framework(s) you are using.

Since many Python projects make use of Mercurial, you should also take the time to learn it if you do not already know how to use it.

Nginx

The first task is to install nginx. I always install nginx from source as it allows me to cut out some of the modules that I have no need for and thus reduce the memory footprint a bit more. You can decide if you want to install from source or not.

First we need to install the dependencies for building nginx (I am using Debian 6 here so change it to whatever your platform requires):

apt-get install gcc libpcre3-dev libssl-dev make

Then run configure, make and make install:

./configure --prefix=/opt/nginx --user=nginx \
--group=nginx --with-http_ssl_module --with-http_realip_module \
--with-http_secure_link_module --with-http_gzip_static_module \
--without-http_scgi_module --without-http_uwsgi_module \
--without-http_autoindex_module --without-http_ssi_module \
--without-http_split_clients_module --without-http_userid_module \
--without-http_memcached_module --without-mail_pop3_module \
--without-mail_imap_module --without-mail_smtp_module \
--without-http_auth_basic_module --without-http_charset_module
 
make
 
make install

(you need to be root or use sudo when running make install).

Now we add the user for nginx:

adduser --system --no-create-home --disabled-login \
--disabled-password --group nginx

Once that is done nginx is ready to go. We just need to set up an init script for it. Since I host with Linode (who are pretty darn good) I use the init script that they provide. If you use a distribution other than Debian / Ubuntu you might have to make some changes or get one specifically for your distribution.

wget http://library.linode.com/webservers/nginx/installation/reference/init-deb.sh
 
mv init-deb.sh /etc/init.d/nginx
 
chmod +x /etc/init.d/nginx
 
/usr/sbin/update-rc.d -f nginx defaults

Now nginx is installed. So let’s start it and move on to the next stage. We will configure nginx for the site separately after we have configured Daemon Tools.

/etc/init.d/nginx start

Daemon Tools

Now let’s install Daemon Tools.

apt-get install daemontools-run

Once that has been completed, create a directory to store our first site’s configuration file. You will create a directory for every project you wish to run on the server. I generally tend to name each folder after the domain of the site to make it easy to tell which directory holds the configuration for each site (although honestly it is unlikely you will need to play around with these configuration files once they are set up and working properly).

mkdir /etc/service/domain.com

Now just create the configuration file for each Django project. Save it as a file called “run” (without the quotes) in the folder you created above for your domain name and then enter something like this:

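#!/usr/bin/env bash

# activate the project's virtualenv (the activate script lives in its bin/ directory)
source /path/to/your/virtualenv/for/project/bin/activate
PROJ_DIR=/path/to/django/project

# setuidgid runs the server as "username"; envuidgid would only set $UID and $GID.
# daemonize=false keeps the process in the foreground so Daemon Tools can supervise it
exec setuidgid username python "$PROJ_DIR/manage.py" \
    runfcgi method=threaded minspare=1 maxspare=2 host=127.0.0.1 \
    port=9001 pidfile="$PROJ_DIR/django_fcgi.pid" daemonize=false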

The configuration shown above is for a very low traffic site, so if you want to use it in production increase the maxspare option to something like 5 to 10 and minspare to at least 2. Be aware that this will require quite a bit of free RAM, so you might have to play around with the options depending on how much you have available. You will also need to change “username” in the above example to the name of the user that should run the Django app’s process, and make the run file executable (chmod +x run) so that Daemon Tools can start it.

Now let us return to our nginx configuration.

Nginx

In order to avoid problems with displaying URLs in our Django application we need to add a line to nginx’s default fastcgi_params file (note that this tutorial was written using version 0.8.54 of nginx; if you are using a different version it might be worth checking whether the line already exists before adding it).

So just add the following line to the fastcgi_params file:

fastcgi_param  PATH_INFO          $fastcgi_script_name;

then we need to add the following line to our Django project’s settings.py file (personally I keep a separate production_settings.py file; the import for it in settings.py stays commented out during development and I only uncomment it when I deploy to production):

FORCE_SCRIPT_NAME = ''

then we just add the correct configuration for our domain to our nginx configuration file:

server {
    listen 80;
    server_name domain.com;
    rewrite ^/(.*) http://www.domain.com/$1 permanent;
}
 
server {
    listen 80;
    server_name www.domain.com;
 
    access_log /srv/www/domain.com/logs/access.log;
    error_log /srv/www/domain.com/logs/error.log;
 
    root /srv/www/domain.com/public_html;
    index index.html;
 
    location / {
        try_files $uri @django;
    }
 
    location /static {
        alias /path/to/project/static;
    }
 
    location /media {
        alias /path/to/project/media;
    }
 
    location @django {
        include /opt/nginx/conf/fastcgi_params;
        fastcgi_pass 127.0.0.1:9001;
        fastcgi_pass_header Authorization;
        fastcgi_intercept_errors off;
    }
}

and away we go.

This is the basic setup fully configured. All we need to do now is make sure that Daemon Tools is running the correct version of our app and that nginx has loaded all the changes we have made. So simply restart our domain’s service:

svc -du /etc/service/domain.com/

and restart nginx:

/etc/init.d/nginx restart

for reference if you just want to stop your Django app:

svc -d /etc/service/domain.com/

or start it:

svc -u /etc/service/domain.com/

and now you should be able to view your site (assuming DNS has propagated) and see the splendour of your new Django application running on possibly the easiest to maintain and update Django web stack available. Hopefully you use some form of distributed source code control such as Mercurial or Git. If so, you should be able to update your project simply by pulling the latest version and then restarting the Daemon Tools service with:

svc -du /etc/service/domain.com/

I hope you have found this article useful. If so (or even if you have not) leave a comment. Also leave a comment if you have any problems and I’ll try and help you out.

Exploring the Django Ecosystem

Posted in Programming on June 2nd, 2010 by Simon Connah

Since my last post I’ve spent some time looking at all the options available to Django developers. There is a vast array of different approaches one can take, and the frameworks included in the core distribution are pretty solid and well documented.

I am aware of the Pinax project, which offers lots of already written Django modules, but I checked fairly recently and they were still behind the Django releases by quite a margin. Whilst the project looks good and I will certainly be keeping an eye on it, I don’t think I would make use of it in a serious project, especially considering how good some of the changes in Django 1.2 are.

One element of Django that I overlooked when initially going through the documentation was the generic views feature. At first glance this seems like either a redundant feature or one that is better used as a small prototyping feature but after closer inspection it is obvious that this is an extremely powerful tool. The fact that you can delegate the entire view code to the main Django distribution not only simplifies your application but also reduces the number of potential bugs that your application contains.

Simply put, any page that displays either a list of objects of a certain model type or a single object of a certain model type is ripe for use with the generic views feature. All you need to do is implement a template, which handles how the view is displayed; all the correct data will automatically be passed to the template for you.

The next important tool to talk about is the built-in comment framework. This simplifies the process of allowing users to post comments on articles and the like. The first time I used Django I actually wrote a simple blog application that implemented this feature itself, but while it was a little more flexible, the comment framework is stable, well documented and maintained by more developers. Any failings of the comment framework are likely to be outweighed by the time you would otherwise spend fixing bugs in your own implementation, which may well have a poorer design in the first place.

The only complaint I have is the somewhat simplistic moderation system that is currently in place. It does the trick, but is not exactly a killer feature.

The final thing I want to talk about is the cache framework and memcached. I was impressed when I saw that the cache framework included not only a site-wide setting but also one which allowed caching on a per-view basis. This opens up some interesting possibilities: you could optimise certain parts of your site, say during a Slashdot stampede, while the rest of the site remains untouched. My only concern would be that the control does not extend to the object level as far as I can see. It certainly gives you access to a low-level cache API which allows the caching of data when you need it, but it would be nicer to have something to automatically cache an object given a certain set of parameters.
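
To give an idea of what that kind of automatic object caching might look like, here is a minimal sketch of a decorator that builds a cache key from a function's arguments. It is only an illustration under some assumptions: DictCache is a made-up in-memory stand-in that mimics the get and set calls of Django's low-level cache API, and cached and load_article are hypothetical names, not part of Django.

```python
import functools


class DictCache:
    """Hypothetical in-memory stand-in with get/set like a low-level cache API."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        return self._store.get(key, default)

    def set(self, key, value, timeout=None):
        # timeout is accepted for API similarity but ignored in this sketch
        self._store[key] = value


cache = DictCache()


def cached(prefix):
    """Cache a function's result under a key built from its positional arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = "%s:%s" % (prefix, ":".join(str(a) for a in args))
            result = cache.get(key)
            if result is None:  # note: a legitimate None result is never cached
                result = func(*args)
                cache.set(key, result)
            return result
        return wrapper
    return decorator


calls = []


@cached("article")
def load_article(article_id):
    calls.append(article_id)  # records each real (uncached) lookup
    return {"id": article_id, "title": "Article %d" % article_id}


load_article(1)
load_article(1)
print(len(calls))  # prints: 1
```

In a real project the dictionary would be replaced by the framework's cache object, and the key would need to cover whatever parameters actually identify the object.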

Anyway this has been my first post reflecting my initial exploration of the framework. I’m keen to keep going as Django really seems the perfect mix of simplicity and flexibility. Stay tuned for more information in the coming weeks.

A Look at Django from a New User’s Perspective

Posted in Programming on May 18th, 2010 by Simon Connah

For the last couple of weeks I have been using Django to build a website I have been meaning to work on for quite some time. I thought I would post some of my impressions of the web framework as a new user to it in the hope that someone finds it useful. I’ll save the reasons why I chose Python and Django for another post.

The first thing that I noticed was the conceptual simplicity of the Django framework. I have become accustomed to frameworks which are ridiculously complicated compared to the goal that they are actually trying to achieve. So complex in fact that instead of looking good, the authors just end up looking stupid for getting so caught up in themselves. Java and C# frameworks seem to be the worst offenders when it comes to this phenomenon.

Setting up views and mapping URLs to those views is also extremely easy, and the code that then loads a template from within said view is not going to be more than a couple of lines (unless you are passing a ridiculous number of variables to the template system). This enables you to set up some complex views with a bare minimum of code. Another winning feature in my mind is the ease of producing JSON output. Whilst this is a feature of the Python standard library, querying the database and then immediately serialising the results to JSON for consumption by clients or indeed your own JavaScript code on your website enables you to create complex web services with the bare minimum of fuss.
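
The query-then-serialise idea can be sketched with nothing but the standard library. This is an illustration rather than Django code: an in-memory SQLite database stands in for the Django ORM, and the article table and the articles_as_json function are made up for the example. In a view, the resulting string would simply be returned as the response body.

```python
import json
import sqlite3

# in-memory database standing in for the project's real database
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO article (title) VALUES (?)",
    [("First post",), ("Second post",)],
)


def articles_as_json(connection):
    """Query all articles and serialise them to a JSON string."""
    rows = connection.execute(
        "SELECT id, title FROM article ORDER BY id"
    ).fetchall()
    return json.dumps([dict(row) for row in rows])


payload = articles_as_json(conn)
print(payload)
```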

One of the features I need most when it comes to web frameworks is an easy way to develop the application. Some web development tools require some rather complex setup to test on your own development server but I have found Django to be amazingly simple on this front. One technique I use is to set it up to use an SQLite 3 database on the local machine and test there whilst in the process of actual development and then transfer everything to the development server, change the database settings to a PostgreSQL database server, sync the database and then test the code to make sure that everything works as expected. This has led to a much faster and easier to manage development cycle for me as I do not have to have my development server running (for PostgreSQL and Apache) in order to test the website. Instead I simply rely on the integrated development server provided by the manage.py script.
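
One way to avoid editing the database settings by hand on each transfer is to pick them based on an environment variable. The following is a rough sketch, assuming the DATABASES dictionary format introduced in Django 1.2 and the engine names of that era; the DJANGO_ENV and DB_PASSWORD variables, the database_settings function and the database names are all hypothetical.

```python
import os


def database_settings(environment):
    """Return a Django 1.2 style DATABASES dict for the given environment."""
    if environment == "server":
        # PostgreSQL on the development server (names are placeholders)
        return {
            "default": {
                "ENGINE": "django.db.backends.postgresql_psycopg2",
                "NAME": "mysite",
                "USER": "mysite",
                "PASSWORD": os.environ.get("DB_PASSWORD", ""),
                "HOST": "localhost",
                "PORT": "",
            }
        }
    # anything else falls back to a local SQLite 3 file
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "dev.sqlite3",
        }
    }


# in settings.py: defaults to the local SQLite setup unless DJANGO_ENV says otherwise
DATABASES = database_settings(os.environ.get("DJANGO_ENV", "local"))
```

After switching to the server settings you would still sync the database as described above.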

The integration of Markdown within the Django framework also allows you to let users create rich text in comments, forum posts and messages to one another, which will certainly be appreciated by them. The only slight annoyance I have with it is that you cannot limit the allowed Markdown syntax, which means that users can post comments in a level-one header style, which potentially ruins your SEO for a given page.

Django’s project management features are pretty good too. I tend to use BBEdit on Mac OS X for my web app development, and using that application’s project management along with the separation of different parts of the web application by Django ensures that you never get overwhelmed by a bunch of random files whose purpose you need to remember. This makes it easy to develop modular parts of your site (a comment system for instance) independently of the parts which will actually make use of the comment system.

Web services, as I have said earlier, are the bread and butter of Django. It is rare nowadays for any serious website to only have web browser clients. Most offer some form of integration with desktop and mobile clients.

Overall I’m highly impressed with both Django and Python as a language. As this is the first real project that I have used them in, it has been an eye-opener. The Python standard library complements Django perfectly and being able to use the wealth of third party Python libraries available makes just about any conceivable task that you may have easy.

So what are you waiting for? Try it out. If you have any questions leave a comment and I’ll get back to you ASAP.

Note:

I started writing this article using Django 1.1.1 but since then Django 1.2 has been released. I have not had time to test the new release properly but some of the new additions certainly look good and a couple of small issues I had have been fixed. So upgrade if you can.