Programming

Removing HTML from Python Strings

Posted in Programming on October 15th, 2011 by Simon Connah

It is incredibly easy to remove HTML from a Python string using the lxml library. All you need to do is something similar to the below snippet:

from lxml import html

# parse the HTML in the string content into a document tree and place the result in doc
doc = html.document_fromstring(content)
# get the text content of doc (with no markup) and place the result in text_doc
text_doc = doc.text_content()

Thus text_doc now contains the text of the original string with no HTML markup in it. You can then pass it to something like Postmarkup, so that users can style the text they post on your site with BBCode or a similar markup language without the security risk of letting them post raw HTML.

If you are using Django, you must mark the rendered string as safe using the safe template filter; otherwise Django will automatically escape the HTML that was legitimately generated from the BBCode.

A full example that uses Postmarkup and lxml to first remove any HTML in a post and then render the BBCode into HTML for display on a site:

from postmarkup import render_bbcode
from lxml import html
 
doc = html.document_fromstring(content)
bbcode = doc.text_content()
content = render_bbcode(bbcode)
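
For comparison, here is a minimal standard library sketch of the same tag-stripping step, for cases where lxml is not installed. It swaps lxml for Python 3's html.parser module, so it is not the method used above and may behave differently on malformed markup; TextExtractor and strip_html are names made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of a document (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; for us
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # called for every run of text between tags
        self.parts.append(data)


def strip_html(content):
    """Return the text of content with all tags dropped."""
    parser = TextExtractor()
    parser.feed(content)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)
```

The result of strip_html(content) can then be handed to render_bbcode in the same way as bbcode in the example above.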

AVR Assembly on Mac OS X – Your First Program

Posted in Programming on December 18th, 2010 by Simon Connah

In the past I wrote an article about writing assembly programs for AVR devices on the Mac but missed a couple of important points.

The major problem with my original article was that I was passing everything through the C compiler rather than just using avr-as as one should. The other problem was that I was using the libc headers to get the names of the specific ports. The better solution is to take the include files that Atmel supply with AVR Studio 4 on the PC and just make a couple of small changes to make them work using avr-as. The include file I use for the ATmega168 can be found here.

By using this approach there is basically no difference between the assembly that you would use in AVR Studio 4 and that you can use on your Mac. This makes the entire development process extremely easy.

So let’s list the source code of our updated ledon.S file from the original article:

.include "/home/dev/m168.h"
 
	rjmp init
 
init:
 
	ser r16
	out DDRB, r16
	out DDRD, r16
 
	clr r16
 
	out PORTB, r16
	out PORTD, r16
 
	.global main
 
main:
 
	sbi PORTB, 0
	rjmp main

it should look pretty similar to the original.

The only real difference comes when you want to assemble it and produce a hex file which you then upload with AVRdude (as documented here). So you do that with these commands (assuming you have saved the above file as ledon.S and that the header file I provided is stored in /home/dev – if it is different change the include line to point to the correct path):

avr-as -mmcu=atmega168 -o ledon.out ledon.S
avr-objcopy -O ihex ledon.out ledon.hex

this should result in a nice ledon.hex file.
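
The file that avr-objcopy writes with -O ihex is in Intel HEX format, where every line is a record ending in a checksum byte. As a quick sanity check before uploading, something like the following Python sketch can confirm that each record is intact. The function names are made up for this example, and it only checks record lengths and checksums, nothing AVR specific.

```python
def record_is_valid(line):
    """Return True if one Intel HEX record has a correct length and checksum."""
    line = line.strip()
    if not line.startswith(":"):
        return False
    try:
        raw = bytes.fromhex(line[1:])
    except ValueError:
        return False
    # Layout: length byte, two address bytes, type byte, data bytes, checksum byte
    if len(raw) < 5 or raw[0] != len(raw) - 5:
        return False
    # The checksum is chosen so that all bytes of the record sum to 0 modulo 256
    return sum(raw) % 256 == 0


def hex_file_is_valid(text):
    """Return True if text has at least one record and every record is valid."""
    records = [ln for ln in text.splitlines() if ln.strip()]
    return bool(records) and all(record_is_valid(r) for r in records)
```

To use it, read ledon.hex as text and pass the contents to hex_file_is_valid.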

Now that you can use the “raw” assembler you can just read the data sheets and the AVR instruction reference and use the port names as is, rather than having to use the rather stupid C macros that AVR libc provides. Obviously this technique is completely cross platform and will work on any UNIX-based machine, not just Macs.

Exploring the Django Ecosystem

Posted in Programming on June 2nd, 2010 by Simon Connah

Since my last post I’ve spent some time looking at all the options available to Django developers. There is a vast array of different approaches one can take, and the frameworks included in the core distribution are pretty solid and well documented.

I am aware of the Pinax project, which offers lots of already written Django modules, but I checked fairly recently and they were still behind the Django releases by quite a margin. Whilst the project looks good and I will certainly be keeping an eye on it, I don’t think I would make use of it in a serious project, especially considering that some of the changes in Django 1.2 are so good.

One element of Django that I overlooked when initially going through the documentation was the generic views feature. At first glance this seems like either a redundant feature or one that is better used as a small prototyping feature but after closer inspection it is obvious that this is an extremely powerful tool. The fact that you can delegate the entire view code to the main Django distribution not only simplifies your application but also reduces the number of potential bugs that your application contains.

Simply put, any page that displays either a list of objects of a certain model type or a single object of a certain model type is ripe for use with the generic views feature. All you need to do is implement a template that handles how the view is displayed; the correct data will automatically be passed to the template for you.

The next important tool to talk about is the built-in comments framework. This simplifies the process of allowing users to post comments against articles and the like. The first time I used Django I actually wrote a simple blog application that implemented this feature itself, but while my version was a little more flexible, the comments framework is stable, well documented and maintained by more developers. Any failings of the comments framework are outweighed by the time you would otherwise spend fixing bugs in your own implementation, which may well have a poorer design in the first place.

The only complaint I have is the somewhat simplistic moderation system that is currently in place. It does the trick, but is not exactly a killer feature.

The final thing I want to talk about is the cache framework and memcached. I was impressed when I saw that the cache framework included not only a site-wide setting but also one which allowed caching on a per-view basis. This opens up some interesting possibilities, such as optimising certain parts of your site during, say, a Slashdot stampede while the rest of the site remains untouched. My only concern would be that the control does not extend to the object level as far as I can see. There is a low-level cache API which lets you cache data when you need to, but it would be nicer to have something that automatically caches an object given a certain set of parameters.
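
To make the object-level idea concrete, here is a rough Python sketch of a "fetch from the cache or load and store" helper built on nothing more than get and set calls, which is the shape of Django's low-level cache API. DictCache is an in-memory stand-in so that the example runs without Django or memcached, and make_key and get_or_load are hypothetical names, not part of Django.

```python
class DictCache:
    """In-memory stand-in exposing get/set like a low-level cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # timeout is accepted for API similarity but ignored in this sketch
        self._data[key] = value


_MISSING = object()  # sentinel, so that a cached None still counts as a hit


def make_key(name, params):
    """Build a deterministic key from a name and a set of parameters."""
    parts = ["%s=%r" % (k, params[k]) for k in sorted(params)]
    return name + ":" + ",".join(parts)


def get_or_load(cache, name, loader, **params):
    """Return the cached object for params, calling loader only on a miss."""
    key = make_key(name, params)
    value = cache.get(key, _MISSING)
    if value is _MISSING:
        value = loader(**params)
        cache.set(key, value)
    return value
```

With a real memcached backend the generated keys would also need to respect memcached's restrictions on key length and characters, which this sketch does not attempt.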

Anyway this has been my first post reflecting my initial exploration of the framework. I’m keen to keep going as Django really seems the perfect mix of simplicity and flexibility. Stay tuned for more information in the coming weeks.

A Look at Django from a New User’s Perspective

Posted in Programming on May 18th, 2010 by Simon Connah

For the last couple of weeks I have been using Django to build a website I have been meaning to work on for quite some time. I thought I would post some of my impressions of the web framework as a new user to it in the hope that someone finds it useful. I’ll save the reasons why I chose Python and Django for another post.

The first thing that I noticed was the conceptual simplicity of the Django framework. I have become accustomed to frameworks which are ridiculously complicated compared to the goal that they are actually trying to achieve. So complex in fact that instead of looking good, the authors just end up looking stupid for getting so caught up in themselves. Java and C# frameworks seem to be the worst offenders when it comes to this phenomenon.

Setting up views and mapping URLs to those views is also extremely easy and the code that then loads a template from within said view is not going to be more than a couple of lines (unless you are passing a ridiculous number of variables to the template system). This enables you to set up some complex views with a bare minimum of code. Another winning feature in my mind is the ease of producing JSON output. Whilst this is a feature of the Python standard library, being able to query the database and then immediately serialise the results to JSON for consumption by clients, or indeed by your own JavaScript code on your website, enables you to create complex web services with the bare minimum of fuss.
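
As an illustration of the query-then-serialise idea using only the standard library, the sketch below runs a query against SQLite and returns the rows as a JSON string. It deliberately leaves Django's ORM out of the picture, and rows_as_json is a made-up helper name.

```python
import json
import sqlite3


def rows_as_json(conn, query, params=()):
    """Run query on conn and return the result rows as a JSON array of objects."""
    # sqlite3.Row lets each row be converted to a dict keyed by column name
    conn.row_factory = sqlite3.Row
    rows = conn.execute(query, params).fetchall()
    return json.dumps([dict(row) for row in rows])
```

In a Django view the same string would be returned in an HTTP response with a JSON content type.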

One of the features I need most when it comes to web frameworks is an easy way to develop the application. Some web development tools require some rather complex setup to test on your own development server but I have found Django to be amazingly simple on this front. One technique I use is to develop and test against an SQLite 3 database on the local machine. I then transfer everything to the development server, change the database settings to point at a PostgreSQL server, sync the database and test the code again to make sure that everything works as expected. This has led to a much faster and easier to manage development cycle for me as I do not have to have my development server running (for PostgreSQL and Apache) in order to test the website. Instead I simply rely on the integrated development server provided by the manage.py script.
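
One way to avoid editing the settings by hand on each transfer is to pick the database from an environment variable. The following is only a sketch for a settings.py that uses the DATABASES dictionary introduced in Django 1.2; the SITE_ENV and DB_* variable names, the default values and the database_settings function are all assumptions made for this example, and older releases name the PostgreSQL engine django.db.backends.postgresql_psycopg2.

```python
import os


def database_settings(environ=os.environ):
    """Return a DATABASES dict: PostgreSQL on the server, SQLite elsewhere."""
    if environ.get("SITE_ENV") == "server":  # hypothetical variable name
        return {
            "default": {
                "ENGINE": "django.db.backends.postgresql",
                "NAME": environ.get("DB_NAME", "mysite"),
                "USER": environ.get("DB_USER", "mysite"),
                "PASSWORD": environ.get("DB_PASSWORD", ""),
                "HOST": environ.get("DB_HOST", "localhost"),
            }
        }
    # Local development: a single SQLite file, no server required
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "dev.sqlite3",
        }
    }


DATABASES = database_settings()
```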

The integration of Markdown within the Django framework also allows you to let users create rich text in comments, forum posts and messages to one another, which will certainly be appreciated by them. The only slight annoyance I have with it is that you cannot limit the allowed Markdown syntax, which means that users can post comments as a level one heading, potentially ruining your SEO for a given page.

Django’s project management features are pretty good too. I tend to use BBEdit on Mac OS X for my web app development, and that application’s project management, along with Django’s separation of the different parts of the web application, ensures that you never get overwhelmed by a bunch of random files whose purpose you need to remember. This makes it easy to develop modular parts of your site (a comment system for instance) independently of the parts which will actually make use of the comment system.

Web services, as I have said earlier, are the bread and butter of Django. It is rare nowadays for any serious website to only have web browser clients. Most offer some form of integration with desktop and mobile clients.

Overall I’m highly impressed with both Django and Python as a language. As this is the first real project that I have used Python in, it has been an eye opener. The Python standard library complements Django perfectly, and being able to use the wealth of third party Python libraries available makes just about any conceivable task that you may have easy.

So what are you waiting for? Try it out. If you have any questions leave a comment and I’ll get back to you ASAP.

Note:

I started writing this article using Django 1.1.1 but since then Django 1.2 has been released. I have not had time to test the new release properly but some of the new additions certainly look good and a couple of small issues I had have been fixed. So upgrade if you can.

Creating, Compiling and Linking Your First AVR Assembly Program on Mac OS X

Posted in Programming on April 11th, 2010 by Simon Connah

Update 3: Although this article is now obsolete (see update 2 below) I thought I’d at least make sure the code blocks display properly using the new syntax highlighting system in use.

Update 2: I have rewritten this article here. If you want to do AVR programming in assembly on Mac OS X then read that article. This one is just kept around for historical reasons and to list another option if you want to do it a different (albeit harder) way.

Update: After a little bit more careful study of the AVR libc library I realised that you can just make use of the macros it contains in order to access ports etc without having any ugly defines.

I struggled to get up to speed with creating my first assembly program for my AVR device. The problem is that the tutorials and information available either target C for the AVR GCC tool chain (doesn’t help me much as I want to use assembly) or target assembly but using another assembler. There does not seem to be any specific information that I could get to work. I hope that this article will help if you find yourself in the same position.

So the first thing you need to do is build your GCC tool chain: here are some instructions. Once that is done you should be all ready to get started.

First things first, I am assuming you’ll be using avr-libc to make use of all the defines which it sets for you (register names etc), gcc, gdb and avrdude (at a minimum).

I tend to define the device I am using at the top of the source file just in case. Then we need to include the AVR libc file which gives us the nice names for ports etc.

#ifndef __AVR_ATmega168__
#define __AVR_ATmega168__
#endif
 
#include <avr/io.h>

We include the AVR libc io.h header file so that we can use the libc defines to access the relevant ports by name rather than having to use their real address or having to define them ourselves.

Now we move on to the actual assembly program. At first I was stuck because the linker complained that no main was defined. I found this somewhat perplexing until I remembered that you need to declare main as a global symbol using the .global directive. So don’t forget to include a .global main line before the main label.

     rjmp Init
 
Init:
 
     ser r16
     out _SFR_IO_ADDR(DDRB), r16
     out _SFR_IO_ADDR(DDRD), r16
 
     clr r16
 
     out _SFR_IO_ADDR(PORTB), r16
     out _SFR_IO_ADDR(PORTD), r16
 
     .global main
 
main:
 
     sbi _SFR_IO_ADDR(PORTB), 0
     rjmp main

The above code is simply meant to turn an LED on and was (very slightly) adapted from the “AVR: An Introductory Course” book.
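
The reason for wrapping every port name in _SFR_IO_ADDR is that, in assembly sources, avr-libc’s names expand to data-space addresses, whereas out and sbi expect the I/O address, which is 0x20 lower. The arithmetic can be sketched in Python as follows. The register addresses are the ones listed in the ATmega168 data sheet; the function names and the lookup table are made up for this example and are not part of avr-libc.

```python
SFR_OFFSET = 0x20  # I/O registers sit 0x20 above their I/O address in data space

# Data-space addresses of the four registers used in the listing above
MEM_ADDR = {"DDRB": 0x24, "PORTB": 0x25, "DDRD": 0x2A, "PORTD": 0x2B}


def sfr_io_addr(name):
    """Mimic _SFR_IO_ADDR: convert a data-space address to an I/O address."""
    io = MEM_ADDR[name] - SFR_OFFSET
    if not 0 <= io <= 0x3F:
        raise ValueError("%s is outside the range reachable by in/out" % name)
    return io


def usable_with_sbi(name):
    """sbi and cbi can only reach the lower 32 I/O addresses."""
    return sfr_io_addr(name) <= 0x1F
```

So out _SFR_IO_ADDR(DDRB), r16 ends up encoding I/O address 0x04, which is what the include file approach gives you directly.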

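It is vital that you save this file with an extension of .S (that is a capital S), as that is what tells GCC to run the file through the C preprocessor before assembling it. With a lowercase .s the #include and the macros above would not be processed.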

We will compile it with the following command:

avr-gcc ledon.S -mmcu=atmega168 -Os -g -o ledon.out

you will need to substitute the atmega168 part with the correct device that you are using. Failure to do so will result in a program compiled for the wrong chip.

We then need to create a hex file from the output which we can then upload to our device using AVRdude. Use this command:

avr-objcopy -O ihex ledon.out ledon.hex

You can find out how to upload it to the device here.

And there you have it. Debugging seems somewhat more complex if your board has no JTAG interface (the Arduino, for example); I’ll need to look into getting something like the AVR Dragon before I can use AVaRICE and GDB together.

Compiling for AVR Devices

Posted in Programming on April 7th, 2010 by Simon Connah

Another small note for myself here.

In order to compile for AVR devices it is essential that you define which AVR you are targeting on the command line with your version of GCC. Chances are you will be targeting an ATmega168 (use the -mmcu=atmega168 command line switch) as that seems to be the most prevalent hobby chip. It should also be noted that it is even more important when compiling code for embedded devices to use the -Wall command line option.
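
So that neither switch gets forgotten, the command lines can be generated by a small script. The Python sketch below builds the avr-gcc invocation with -mmcu and -Wall always present, followed by the avr-objcopy step used elsewhere on this blog to produce a hex file. build_commands is a made-up helper, and the output file names are simply derived from the source file name.

```python
import os


def build_commands(source, mcu="atmega168"):
    """Return the argument lists for compiling source and creating a hex file."""
    stem = os.path.splitext(source)[0]
    compile_cmd = ["avr-gcc", "-Wall", "-Os", "-g", "-mmcu=" + mcu,
                   source, "-o", stem + ".out"]
    hex_cmd = ["avr-objcopy", "-O", "ihex", stem + ".out", stem + ".hex"]
    return [compile_cmd, hex_cmd]
```

Each list can be passed to subprocess.check_call in turn, which stops at the first command that fails.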

Also for future use the avr/delay.h file is now util/delay.h (which gets me every time).

Compiling Boost on Mac OS X

Posted in Programming on August 20th, 2009 by Simon Connah

I often see people asking how to compile the Boost C++ libraries on Mac OS X so I thought I would post a quick tutorial.

  1. Download B-jam from http://www.boost.org/ – you will find it in the “Downloads” area of the site. Make sure you download the Mac OS X binary file to avoid problems.
  2. When B-jam has been downloaded, extract the archive by double clicking on it.
  3. Move the executable entitled “bjam” to a directory on your path. If you don’t know what this means then just open the Terminal application and type “cd ” (the trailing space is very important) then drag the folder which you just extracted onto the Terminal window, click once in the Terminal window and then press return.
  4. Type “sudo mv bjam /usr/bin” exactly as shown, you will then be asked to type your administrator password. Once you have typed it in press return and bjam will now be accessible from the command line.
  5. Next download the source code distribution of Boost from http://www.boost.org/ – this is found in the same place as B-jam.
  6. Extract the archive and then open the folder in Terminal using the same method outlined above.
  7. Type “bjam stage --build-type=complete” and press return. Wait for a couple of hours for it to finish and then type “sudo bjam install --build-type=complete” to install the boost libraries on your machine.

Update: If you want to build just the 64 bit, thread safe, release mode versions of the shared libraries (which makes for a much, much faster build), you can replace the build command in step 7 above with this one:

bjam toolset=darwin link=shared threading=multi runtime-link=shared variant=release address-model=64 stage

Update 2: To build a universal binary containing a 32 bit version for PowerPC Macs and both 32 bit and 64 bit versions for Intel Macs, replace the build command in step 7 above with the following command:

bjam toolset=darwin threading=multi variant=release link=shared runtime-link=shared address-model=32_64 architecture=combined

Using Open Source Libraries on Mac OS X

Posted in Programming on August 20th, 2009 by Simon Connah

Mac OS X has a rich development environment that is at least the equal of Linux in terms of diversity and available features. For a start Apple supply Xcode, which, in my opinion, is one of the best IDEs available on any platform. Included with the Xcode IDE is a large set of tools including GCC, NASM, Flex, Bison, Make and many other tools traditional UNIX programmers will find familiar.

I want to show the more traditional Mac users who perhaps only know about Objective-C and Cocoa what the possibilities are when it comes to using the vast array of open source software on Mac OS X.

The first thing that should be said is that open source libraries normally come in source code form and thus require you to compile them yourself. This is actually a very easy process, and normally just requires you to issue the following commands:

./configure
 
make
 
sudo make install

from the Terminal application. Obviously this will only compile and install the default package. If you want more control over what is installed or where it is installed, you will need to supply some options to the configure script.

In order to find out what options are available to you just type the following command:

./configure --help

and it should list all the available options that you can pass to the configure script. The default install location for most libraries will be /usr/local which in 99% of cases is fine.

Assuming there are no missing dependencies for the particular library you are trying to compile the configure script should run successfully. Once that is done you can then compile the library using the make command. Once it has compiled all you need to do is install it using the sudo make install command.

Now that you have the library installed you will most likely want to make use of it from within your Xcode projects. This is an area that many people seem to find somewhat confusing so I will try to keep this as simple as possible.

There are two stages to using a library (other than the ones that come with Xcode / the operating system itself). First you must tell Xcode where to locate header files and secondly you must tell Xcode where to find the compiled dynamic or static libraries. The first stage is very simple indeed: with your project open, go to the Project menu and select “Edit Project Settings”. This will bring up the build configuration window where you can set numerous options to do with compilation and deployment relating to your project.

Look for an item called “Header Search Paths” in the Search Paths section shown in the image below.

Xcode Build Settings

My project, as you can see, already has the header and library search paths defined. In this case it tells the compiler to look in /usr/local/pgsql/include for header files and /usr/local/pgsql/lib for library files. If you see a star after the path it means that you have told Xcode to search recursively through all sub folders of the specified path for the header and library files that you have included. Be careful with this option as it can lead to some odd build errors, although that has only happened to me once, so your mileage may vary.

When passing the default values to a configure script it almost always installs the files in /usr/local. Given this assumption you should put /usr/local/include in the Header Search Paths box and /usr/local/lib in the Library Search Paths box. Obviously if you told the configure script to install it somewhere else then adjust the paths accordingly.
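
If it helps to see what these settings amount to, the two search path boxes correspond to the -I and -L options that are passed to the compiler and linker, and each library you link against becomes a -l option. The short Python sketch below builds those flags from an install prefix. compiler_flags is a made-up helper, and the library name in the usage note is only an assumed example.

```python
def compiler_flags(prefix="/usr/local", libraries=()):
    """Return the -I, -L and -l flags for libraries installed under prefix."""
    flags = ["-I" + prefix + "/include",  # Header Search Paths
             "-L" + prefix + "/lib"]      # Library Search Paths
    flags += ["-l" + name for name in libraries]
    return flags
```

For the PostgreSQL paths mentioned above, compiler_flags("/usr/local/pgsql", ["pq"]) gives the flags you would otherwise type by hand when compiling from the Terminal.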

The last step is to add the library itself to the Xcode project. With your project open go to the Project menu again and select the option “Edit Active Target” (it will have the name of your application after that). This will open the following window:

Xcode Library Settings

As you can see the bottom panel already includes two libraries in my project and both of them are required dependencies. You can add your own library dependencies by just clicking the bottom “+” button and selecting the required library / framework from the list that pops up (assuming you set the paths correctly in the preceding set of instructions).

That is pretty much all there is to it. Hopefully that will have opened up a whole wealth of open source software which you can now make use of in your own projects (make sure you comply with their licenses though :) ); everything from audio, graphics, maths, science and language is covered in one form or another by open source software so I am sure that there will always be something that you find useful after a little searching.

As always, if you have any questions or comments please post below and I will try to get back to you as soon as possible.

Compiling and Installing NASM on Mac OS X

Posted in Programming on August 19th, 2009 by Simon Connah

First things first, make sure you have installed the latest version of Xcode from the Apple Developer website (the version that comes on the Mac OS X DVDs is most likely an older version). Once that is installed you need to download the latest source code distribution of NASM from here; it is located under the Download tab.

Once you have the file downloaded (it should be called something like nasm-x-xx.tar.bz2 – if it has RC in the name download the next lowest number that does not have RC in the file name as we want to use a stable version for the purposes of learning) just double click on it to extract its contents into a folder.

Now open up the Terminal application which is located in /Applications/Utilities/ and type “cd” followed by a space. Then drag the folder that was just extracted from the nasm archive and drop it onto the Terminal window. This should place the path to the folder after the cd and space. If it looks something like:

cd ~/Downloads/nasm-2.07

then you have done it correctly. Press return and the terminal should place you inside the nasm distribution folder. Type the following commands exactly as they appear below:

./configure
 
make
 
sudo make install

after you have typed sudo make install it will ask you for your password. This is your administrator account password. It will not show anything as you type as a security measure so make sure you don’t forget how far you have got :) .

Now inside the Terminal window if you type

nasm -v

you should see output like the following:

NASM version 2.07 compiled on Aug 13 2009

congratulations, you now have the latest version of NASM installed. If you have any questions please leave a comment and I will try my best to answer them (perhaps in a new blog post).
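
If you want to check the installed version from a script rather than by eye, the version line can be parsed with a regular expression. This Python sketch assumes the output has the same shape as the line shown above; parse_nasm_version is a made-up name.

```python
import re

# Matches "NASM version 2.07" as well as three-part versions
_VERSION = re.compile(r"NASM version (\d+)\.(\d+)(?:\.(\d+))?")


def parse_nasm_version(output):
    """Return the version in output as a tuple of ints, or None if not found."""
    match = _VERSION.search(output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)
```

The text to parse can be captured by running nasm -v through the subprocess module and reading its standard output.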