automated rsync backup

sometimes i need to back up different servers. instead of backing up each single server to long term storage, i usually copy the important files from every server to a partition on a single server and then back up only that partition. backups can contain quite sensitive data, so this has to be done in a secure way. it also has to happen automatically, so no human is around to type in the backup server's password.

one approach is using scp or rsync. i usually go for rsync because it can often reduce network traffic massively. to set it up relatively securely i add an ssh dsa key for the root user of each machine that has files to back up: ssh-keygen -t dsa. do not enter a passphrase, because the backup has to happen automatically and there is no one around to type it in. use the proposed file names for the private and public keys.

now you can create a script which copies your data, for example an sql dump, to the backup server. instead of using the root account we will create a backup account: rsync sqldump.sql backup@backupserver:/backups/client1/sqldump.sql. this command doesn't work yet; first we have to create the user backup on the server.

after that it should work, but there is one problem: it asks for a password. to fix that we take the generated dsa public key and append its content (one line of gibberish) to the backup account's ~/.ssh/authorized_keys file. if this file or directory doesn't exist yet you can create it, but you have to make sure that only the owner of the file has write or read permissions. if others can read that file, some linux flavours won't allow clients to connect.

if this worked you can now copy the backups from the clients with a cron job; no one asks for a password any more. that's nice, but a bit insecure: if one of the client machines gets hacked, the intruder can connect to the backup server and read all backed up files. that's not really nice. to prevent that we can limit the key to execute only one single command on the backup server. for our command, rsync executes the following on the target server: rsync --server -e.L . /backups/client1/sqldump.sql

in the authorized_keys file we need to prepend the following part in front of the ssh-dss at the start of our public key line. it also contains some additional options to disable ssh functionality like creating a tunnel: command="rsync --server -e.L --inplace . /backups/client1/sqldump.sql",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-dss XX...YY== client1
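to illustrate, such a restricted key line can be assembled with a little helper. this is just a sketch: the helper name is made up, but the forced command and the no-* options are the ones from above.

```python
# the ssh options that disable tunnels, x11 forwarding and interactive shells
OPTIONS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"

def restricted_key_line(pubkey_line, target):
    # force every connection made with this key to run only the rsync
    # server command for the given target path
    forced = 'command="rsync --server -e.L --inplace . %s"' % target
    return forced + "," + OPTIONS + " " + pubkey_line

line = restricted_key_line("ssh-dss AAAA...== client1",
                           "/backups/client1/sqldump.sql")
```

the resulting string is what goes into the backup account's authorized_keys file, one line per client.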

after that, whatever command you try to execute on the backup server when connecting as the backup user, the above is executed instead. so the most terrible thing an attacker can do is overwrite the sqldump file on the backup server.

i can't guarantee that this setup is really secure or that there are no holes. i'm no sysadmin and have only limited knowledge about ssh, so use it at your own risk.

Router with failover line

Last month I worked in the Nairobi office. For internet we have a KDN fibre line. Because it's so often down we also have a KDN WiMAX line. But even with this configuration we sometimes have downtime, so we also ordered a ZUKU WiMAX line.

Both KDN lines use the same IP, so I don't think it's easily possible to have them both plugged into the router for redundancy. When the fibre is down, someone has to unplug its cable and plug in the WiMAX one. For the ZUKU line I wanted a better solution: if KDN has a problem, the ZUKU line should take over automatically. The first approach was to use an old PC with Linux installed and three network cards. But there are a few serious drawbacks: a whole PC uses lots of power, it's expensive, it heats up the office, it's not so green, and if we have a power cut it drains the backup battery faster. A router with redundant line capabilities is quite expensive and also uses lots of power. It's also oversized for our little office.

But there is OpenWrt and a few similar projects. They provide a Linux distribution which you can install on a few cheap and small wireless routers. With a proper operating system installed on them, these routers with almost no built-in functionality become very powerful devices.

It was really hard to find a supported router in Nairobi. In Switzerland you can get them in lots of online and offline shops; here we had to find someone who specializes only in routers. The one we bought was a Linksys WRT54GL. The next step was to install OpenWrt on it. There are two flavors; I used Kamikaze, the more modern one. You can use the original web interface to upload the new firmware, and there is a page describing all the ways to install it. The router now has a very powerful web interface. You can already add multiple WAN interfaces; I created one for the ZUKU line called zukuwan. You also have to add a third VLAN (*1) which uses one of the LAN ports and the zukuwan network. I also had to add the zukuwan network to the WAN zone of the firewall. To modify the firewall settings I had to install the LuCI firewall package (*2).

You can check whether both WAN lines are working properly by plugging in only one WAN line at a time and restarting the router.

A nice guy created a script which adds line balancing and failover support. This requires some rather complex routing setup; the script manages that and also checks whether a line is down. You have to add the packages multiwan and luci-app-multiwan (the config interface) from https://forum.openwrt.org/viewtopic.php?id=23904.

After you have installed the packages you have a Multi-WAN configuration page in the network menu. There, remove the wan2 config and add a new one called zukuwan, or whatever you called your second WAN connection. You can also remove all of the entries in mwanfw. The default route should be fastbalancer. Set the failover_to of each interface to the other one.

After this it should work. If it's not working, there is a way to check. You can connect to the router via ssh if you have changed your password. There, type: ip route show table 123. This should show two default routes. If you unplug one cable it should take about 10 to 20 seconds and then that route should be removed.





(*1) Go to the Administration/Network/Switch page. Add an interface ethX.2 and set it to ports 0 and 5. Remove port 0 from ethX.0. For the second WAN connection set eth0.2 as the interface.

(*2) Go to Administration/Overview/LuCI Components. Check the luci-app-firewall package and press the Install button. After a restart you have the Administration/Network/Firewall config page. Select wan and zukuwan for the wan zone.

Merge and split PDFs

To make one PDF file out of many is not so hard on Linux; usually ghostscript is already installed. You can use gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=merged.pdf file1.pdf file2.pdf file3.pdf to do that. Instead of providing every single filename, something like *.pdf is more usable.
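The flags stay the same however many inputs there are. As an illustration, here is a sketch in python that assembles the gs command line from a glob pattern; the helper name is made up, the flags are the ones above.

```python
import glob

def build_merge_cmd(out, pattern="*.pdf"):
    # collect the input files with a glob and put the gs flags in front;
    # sorting keeps the page order deterministic
    files = sorted(glob.glob(pattern))
    return ["gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
            "-sOutputFile=" + out] + files

cmd = build_merge_cmd("merged.pdf")
# import subprocess; subprocess.check_call(cmd)  # uncomment to actually run gs
```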

To extract a few pages out of one big PDF you can use: gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=pages_10_to_15.pdf -dFirstPage=10 -dLastPage=15 big_file.pdf

Wordpress login used for rails application

in order to use the wordpress login mechanism in another non-php application (rails in my case) there are two solutions. for both you need to read the wordpress login cookie. it is named wordpress_logged_in_XXXX, where XXXX is a random string. it contains a string of the form admin|YYYY|ZZZZ, where YYYY is the expiration date (in seconds, i think) and ZZZZ is a hash calculated from lots of different inputs.
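the cookie value is easy to pull apart. a little sketch (the field layout is the one described above, the function name is made up):

```python
def parse_wp_cookie(value):
    # the wordpress login cookie value looks like "username|expiration|hash"
    username, expiration, cookie_hash = value.split("|")
    return username, int(expiration), cookie_hash

user, exp, h = parse_wp_cookie("admin|1250000000|abc123")
```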

the first solution is to take the cookie from the browser, fetch the different input parameters for the hash from the config files and the database, and compare the calculated hash with the one from the cookie. but it's quite hard to calculate the hash: it uses not only username and password, but also salt values from the config file or the database, depending on many configuration options. doing it properly would be quite hard. there is an easier way which is not much less secure: instead of only sending the cookie to the browser, we could also write it somewhere into a temp folder on the filesystem. then the second application can read the cookie from the browser and check it against the one on the filesystem. instead of the filesystem we could also use memcached to store the cookie info.

whenever wordpress writes the cookie we also have to write it to the filesystem. this can be done inside the wp_set_auth_cookie function in the file wp-includes/pluggable.php. but there is another problem: the name of the cookie is the same for different clients, so we have to use another name for the file. i'd propose to use the username of the logged in user. the value which has to be written is stored in the logged_in_cookie variable inside the wp_set_auth_cookie function.

$user = get_userdata($user_id);
file_put_contents('/tmp/wp_rails_'.$user->user_login.'_'.$expiration, $logged_in_cookie);

with the code above, the cookie's content for each logged in user is written to a file in the tmp folder. in the rails (or whatever) application this file can be read and compared to the actual cookie sent by the browser. if it matches, the user is properly logged in.
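on the non-php side the check then boils down to reading that file and comparing strings. a sketch of what that could look like (file naming scheme as above, helper name made up):

```python
import os

def wp_cookie_valid(user, expiration, cookie_value, tmpdir="/tmp"):
    # wordpress wrote the cookie value to /tmp/wp_rails_<user>_<expiration>;
    # the login is valid if the browser sent exactly that value
    path = os.path.join(tmpdir, "wp_rails_%s_%s" % (user, expiration))
    if not os.path.exists(path):
        return False
    with open(path) as fd:
        return fd.read() == cookie_value
```

an expired cookie cleans itself up in a sense: its filename contains the old expiration timestamp, so a stale value never matches a fresh login.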

google geocoding service

google has a great geocoding service. it is possible to find out the latitude and the longitude of an address, and the opposite is possible too. for a little project i had to get the coordinates of an address a user entered. if the user enters only a city name, multiple cities may match, and the geocoding service returns a list of all of them with details. for example if i search for "stein" (http://maps.google.com/maps/geo?q=stein&key=key), multiple entries in germany and multiple entries in austria are returned. the user is then asked to pick the proper location. i need this project only for germany, so i append a postfix to the query string, like "stein, germany" (http://maps.google.com/maps/geo?q=stein, germany&key=key). but now only one result for germany is returned instead of all the german matches like before. i tried it with other cities and it looks like a general problem. even when i use the gl parameter, like &gl=de, instead of appending ", germany" to the query string, it won't work properly.
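as a side note, the query string should be url-encoded when building the request programmatically. here is a little sketch using the /maps/geo endpoint and the q/gl/key parameters from above; the helper name is made up.

```python
try:
    from urllib.parse import urlencode  # python 3
except ImportError:
    from urllib import urlencode  # python 2

def geocode_url(query, key, country=None):
    # builds the request url for google's geocoding service; the gl
    # parameter biases results to a country code (which, as noted
    # above, did not help in this case)
    params = [("q", query), ("key", key)]
    if country:
        params.append(("gl", country))
    return "http://maps.google.com/maps/geo?" + urlencode(params)

url = geocode_url("stein, germany", "key")
```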

i have no idea why this happens and no idea how to fix it. it's probably possible to "fix" the problem by using a viewport, so you can instruct the service to return only cities from a given bounding box. it's not a really good solution, but better than nothing.

check html templates for spelling errors

my spelling is horrible. it's quite annoying to update a webpage and then get complaints about spelling errors. usually all the human language strings are nicely stored in a separate textfile, but for smaller projects i often put them directly into the html templates. if you have a proper ide it will do the spell checking for you. but if you use a not so sophisticated ide you have to do it yourself, or you can use one of the open source spell checkers: there are ispell, aspell, myspell and some more. at first i tried to write a shellscript using one of them, then i remembered once using one from python. there is this nice library, pyEnchant, which makes spellchecking really easy. here is a little python script which checks all the html templates. the advantage is that it can be implemented as a unittest or something similar, so you will be warned if there is a new spelling error in your project.

the script is quick and dirty and for german, but you should get the idea.

from enchant.checker import SpellChecker
import os, re, codecs, sys

chkr = SpellChecker("de_DE")

# patterns to remove: in this case html and jinja2/django
# code and some special words
rmPatterns = [r'<.*?>', r'{%.*?%}', r'{{.*?}}',
              r'me@norep\.com', u'projectName', u'FooName']

# get a list of directories and subdirectories
def listdirs(dirname):
    dirs = [os.path.join(dirname, f) for f in os.listdir(dirname)
            if os.path.isdir(os.path.join(dirname, f))]
    for d in dirs[:]:
        dirs += listdirs(d)
    return dirs

for d in listdirs('templates'):
    for f in [os.path.join(d, f) for f in os.listdir(d)
              if re.search(r'\.html$', f)]:
        # read file data... as unicode, i always use unicode
        fd = codecs.open(f, 'r', 'utf-8')
        data = fd.read()
        fd.close()
        # remove the tags and codes defined in rmPatterns
        for p in rmPatterns:
            data = re.sub(p, '', data)
        # collect misspelled words
        found = []
        chkr.set_text(data)
        for err in chkr:
            found.append(err.word)
        # if errors were found, print them
        if len(found) > 0:
            print "%s: %s" % (len(found), f)
            for w in found:
                print "  : %s -> %s" % (w, ', '.join(chkr.dict.suggest(w)))

the output for one of my projects is:

1: templates/pages/about.html
  : stösst -> stößt, störst
2: templates/pages/help.html
  : Registrier -> Registrier-, Registriere, Registriert, Registrieren
  : Bestätigungs -> Bestätigung, Bestätigungs-
3: templates/pages/legalNotice.html
  : St -> Set, St., Et, Kt, Sh, SV, Ist, Äst, Ast, Ost, Gst, Sät, So
  : mail -> Mail, mal, -mail, mail-, mai-
  : tel -> teil, Gel, Tel., Telex, Teller, Telekom

zine, wordpress killer

i'm thinking about a new blog, a nontechnical trashblog or something. for the kerbtier blog i'm using wordpress and i'm quite happy with it: lots of useful plugins and quite stable. but the code is somewhat funny and it's not so easy to adapt/extend it to one's needs. i often had to change code instead of reconfiguring or adding code, and this makes it time consuming to keep up to date.

i looked for another blog software and found zine. it's quite similar to wordpress but written in python. i did a test installation and so far everything went perfectly. i didn't test it thoroughly, but the mere fact that it's written in python makes me believe in it much more than in wordpress.

is XUL really cool?

i recently started to develop a little twitter client in XUL. i have done a lot of web development with lots of javascript and i thought it would be fast and easy to develop a XUL application. now, after a few hours, i must say that XUL is a great thing. the available widgets are quite complete, the xml to describe a UI is properly designed, and it's easy to manipulate the whole thing with javascript. i haven't had a look at the template stuff yet and i don't know how usable it is. i'm a bit sceptical…

but

a pain in the ass is that javascript frameworks like prototype only work partially. to manipulate the DOM with the DOM API you need lots of inelegant lines of code. this would not be a problem if someone wrote a little javascript library for easier DOM manipulation, but i didn't find anything. most javascript libraries are quite browser specific and not properly usable from XUL. in general there is almost no community produced stuff like tutorials, examples, proper IDEs… it is a bit frustrating to have almost only the API-reference-like tutorial and the API reference on the official XUL homepage. firefox and mozilla are so well known. why isn't there a big XUL echo from the internet?

apache with a segmentation fault

i deployed a python app on a live server and the only output i got was a blank page. there was no error in the virtual host's log file and it stopped somewhere during the execution of the script. in the apache error log file i found the following line:

child pid 15136 exit signal Segmentation fault (11)

a very informative and helpful message. after a bit of googling i found out that gdb, the GNU Project Debugger, would help. in the file /usr/share/doc/apache2.2-common/README.backtrace there is a short howto to get a stacktrace of a segmentation fault in apache… at least on debian based systems. here is a short overview of what there is to do. at first it's necessary to install the following packages:

apt-get install apache2-dbg libapr1-dbg libaprutil1-dbg gdb

then add the line

CoreDumpDirectory /var/cache/apache2

to your apache config, usually /etc/apache2/apache2.conf. after a restart, apache should now create a memory dump named /var/cache/apache2/core which can be analysed with gdb. it might be necessary to raise the maximum size of the core dump like the following (including a restart of apache):

/etc/init.d/apache2 stop
ulimit -c unlimited
/etc/init.d/apache2 start

to analyze the core dump you need to execute gdb like this:

gdb /usr/sbin/apache2 /var/cache/apache2/core
(gdb) bt full
...
(gdb) quit

if you use the threaded mpm (unlikely) then you need to use:

gdb /usr/sbin/apache2 /var/cache/apache2/core
(gdb) thread apply all bt full
...
(gdb) quit

my dump produced the following output:

#0  0xb7dd86a5 in free () from /lib/libc.so.6
#1  0xb6675011 in RelinquishMagickMemory () from /usr/lib/libMagick.so.9
#2  0xb6625ba0 in DestroyDrawInfo () from /usr/lib/libMagick.so.9
#3  0xb57d9857 in Magick::Options::~Options () from /usr/lib/libMagick++.so.10
#4  0xb57d6725 in Magick::ImageRef::~ImageRef () from /usr/lib/libMagick++.so.10
#5  0xb57cbfe6 in Magick::Image::~Image () from /usr/lib/libMagick++.so.10
#6  0xb59ed7f3 in boost::python::objects::value_holder::~value_holder () from /var/lib/python-support/python2.5/PythonMagick/_PythonMagick.so
#7  0xb581adea in ?? () from /usr/lib/libboost_python-gcc42-1_34_1-py25.so.1.34.1
#8  0xb6ce6f4f in ?? () from /usr/lib/libpython2.5.so.1.0
#9  0x0889a39c in ?? ()
#10 0xb6d8f7e0 in ?? () from /usr/lib/libpython2.5.so.1.0
#11 0xbf80d088 in ?? ()
#12 0xb6ce6c60 in ?? () from /usr/lib/libpython2.5.so.1.0
#13 0x00000000 in ?? ()

i was using PythonMagick, which uses Magick++, which uses ImageMagick. it was a bit irritating that Magick++ version 10 used ImageMagick version 9 instead of 10. after removing ImageMagick version 9 the problem was gone. no idea why it used the wrong version.

image manipulation with python

in webapps you often need to manipulate images: create thumbnails, add shadows or borders, create captchas and many other things. usually i use imagemagick. it's a very powerful image manipulation tool with apis for many languages. i have often experienced difficulties when trying to use it from a specific language: there is usually almost no documentation for the apis and it looks like they are not used a lot. many people tend to call the imagemagick executable through an os call instead.

how is it with python

i found a few imagemagick python modules. most were completely undocumented, unfinished or more or less uninstallable (at least on a 64 bit debian). the one that works properly is PythonMagick. i first tried to install the latest version, but due to some dependency problems that wasn't possible on my ubuntu system. there is a deb package available which was easily installed with: sudo apt-get install python-pythonmagick. it's not the latest version, but it worked ok, at least for my needs.

how do you use it

i found no documentation for it, but it was quite simple to figure out the basic things. there is documentation for Magick++, an object oriented wrapper around imagemagick, where all the image methods have short descriptions. to add a red 2 pixel border to an image, for example, you need the following code:

from PythonMagick import Image
i = Image('example.jpg')    # reads the image and creates an image instance
i.borderColor("#ff0000")    # sets the border paint color to red
i.border("2x2")             # paints a 2 pixel border
i.write("out.jpg")          # writes the image to a file

the parameter "2x2" of the border method is a geometry string, used by imagemagick as input for many methods. here is another example which crops the image, flips it, adds an oil paint look and finally writes it as a png:

i = Image('example.jpg')
i.crop("100x100+25+25")
i.flip()
i.oilPaint(2)
i.write('out.png')

if i find some spare time i will add a few more complex examples of how to use imagemagick from python.