Mon, 18 Jan 2010
nose 0.11
I know nose 0.11 is old news, but I've only recently discovered its new multiprocess module.
lmacken@tomservo ~/bodhi $ nosetests
................................................................................................
----------------------------------------------------------------------
Ran 96 tests in 725.111s
OK
lmacken@tomservo ~/bodhi $ nosetests --processes=50
................................................................................................
----------------------------------------------------------------------
Ran 96 tests in 10.915s
OK
Nose 0.11 is already in rawhide, and will soon be in updates-testing.
Note to self (and others): Buy the nose developers beer at PyCon next month
posted at: 16:58 | link | | 2 comments
Thu, 10 Dec 2009
FUDCon Toronto 2009
Another FUDCon is in the books, this time in Toronto. It was great to catch up with many people, put faces to some names, and meet a bunch of new contributors. I gave a session on Moksha, which I'll talk about below, and was also on the Fedora Infrastructure panel discussion.
My goal this FUDCon wasn't to crank out a ton of code, but to focus on gathering and prioritizing requirements and to help others be productive. Here are some of the projects I focused on.
Moksha
Moksha is a project I created a little over a year ago, which is the base of a couple of other applications I've been working on as well: Fedora Community and CIVX. I'll be blogging about these in more detail later.
One of the main themes of FUDCon this year was Messaging (AMQP), and Moksha is a large part of this puzzle, as it allows you to wield AMQP within web applications. During my session the demo involved busting open a terminal, creating a consumer that reacts to all messages, creating a message producer, and then creating a live chat widget -- all of which hooked up to Fedora's AMQP broker.
I'll be turning my slides into an article, so expect a full blog post explaining the basics soon. In the meantime, I found Adam Miller's description to be extremely amusing:
"I walked into a session called "Moksha and Fedora Community -- Real-time web apps with Python and AMQP" which blew my mind. This is Web3.0 (not by definition, but that's what I'm calling it), Luke Macken and J5 completely just stepped over web2.0 and said "pffft, childs play" (well not really but in my mind I assume it went something like that). This session showed off technology that allows real time message passing in a web browser as well as "native" support for standard protocols. The project page is https://fedorahosted.org/moksha/ and I think everyone on the planet should take some time to go there and enjoy the demo, prepare to have your mind blown. Oh, and I also irc transcribed that one as well http://meetbot.fedoraproject.org/fudcon-room-3/2009-12-05/fudcon-room-3.2009-12-05-22.07.log.html ... presentation slides found: http://lmacken.fedorapeople.org/moksha-FUDConToronto-2009.odp"
Fedora Community
So after we released v1.0 of Fedora Community for F12, all of us went off in separate directions to hack on various things. J5 wrote AMQP JavaScript bindings, which I then integrated into Moksha. Máirín Duffy built a portable usability lab and has been doing great research on the usability of the project. And I dove back into Moksha to solidify the platform.
After we deploy our AMQP broker for Fedora, and once we start adding shims into our existing infrastructure, we'll be able to start creating live widgets and message consumers that can react to events, allowing us to wield Fedora in real-time. This will let us keep our fingers on the pulse of Fedora, automate and facilitate tedious tasks, and gather metrics as things happen.
During the hackfests I also did some work on our current Fedora Community deployment. Over the past few weeks some of our widgets randomly died, and we haven't been receiving proper error messages. So, I successfully hooked up WebError and the team is now getting traceback emails, which will help us fix problems much faster (or at least nag the hell out of us about them).
I also worked with Ian Weller on the new Statistics section of the dashboard, which has yet to hit production. Ian and I wrote Wiki metrics, Seth Vidal wrote BitTorrent metrics, and I wrote Bodhi metrics. We've also got many more to come. My main concern was a blocker issue that we were hitting with our flot graphs when you quickly bounce between tabs. I ended up "fixing" the bug, so I'll be pushing what we have of the stats branch into production in the near future.
TurboGears2
TurboGears has definitely been our favorite web framework within Fedora's Infrastructure for many years now. TurboGears2, a complete re-invention of itself, has been released recently, and is catching on *very* quickly in the community. Tons of people are working on awesome new apps, and loving every minute of it. I was also able to convert a rails hacker over to it, after he was able to quickly dive into one of the tutorials with ease. See my previous blog post about getting up and running with TG2 in Fedora/EPEL.
python-fedora
One of my main tasks during the hackfests was to pull out the authentication layer in Fedora Community that authenticates against the Fedora Account System, and port it over to python-fedora, so we can use it in any TurboGears2 application. I committed the initial port to python-fedora-devel, and have started working on integrating it into a default TG2 quickstart and documenting the process. There are still a couple of minor things I want to fix/clean up before releasing it, so expect a blog post about it soon.
Bodhi
It seems like yesterday that I was an intern at Red Hat working on an internal updates system for Fedora Core. Coming up on 5 years later, I am now working on my 3rd implementation of an updates system, Bodhi v2.0. What's wrong with the current Bodhi, you ask? Well, if you talk to any user of it, you'll probably get a pretty long list. Bodhi is the first TurboGears application written & deployed in Fedora Infrastructure, and uses the vanilla components (SQLObject, kid, CherryPy2). The TG1 stack has been holding up quite nicely over the years, and is still supported upstream, but bodhi's current implementation and design do not make it easy to grow.
Bodhi v2.0 will be implemented in TurboGears2, using SQLAlchemy for an ORM, Mako for templates, and ToscaWidgets2 for re-usable widgets. It will be hook-based and plugin-driven, and will be completely distribution agnostic. Another important goal will be AMQP message-bus integration, which will allow other services or users to react to various events inside of the system as they happen.
So far I've ported the old DB model from SQLObject to SQLAlchemy, and have begun porting the old unit tests, and writing new ones. Come the new year, I'll be giving this much more of my focus.
During the hackfests I got a chance to talk to Dennis Gilmore about various improvements that we need to make with regard to the update push process. It was also great to talk to many different users of bodhi, who expressed various concerns, some of which I've already fixed. I also got a chance to talk to Xavier Lamien about deploying Bodhi for rpmfusion. On the bus ride home I helped explain to Mel how Bodhi & Koji fit into the big picture of things.
During the BarCamp sessions I also attended a session about the Update Experience, where we discussed many important issues surrounding updates.
liveusb-creator
So I got a chance to finally meet Sebastian Dziallas, of Sugar on a Stick fame, and was able to fix a few liveusb-creator issues on his laptop. I ended up pushing out a new release a couple of days ago that contains some of those fixes, along with a new version of Sugar on a Stick.
The liveusb-creator has been catching a lot of press recently (see the front page for a list). Not only did it have a 2 page spread in Linux Format, but it was also featured in this week's Wired.com article New Sugar on a Stick Brings Much Needed Improvements. Rock.
Python
There was a lot of brainstorming done by Dave Malcolm, Colin Walters, Toshio Kuratomi, Bernie Innocenti, myself, and many others about various improvements that we could make to the Python interpreter, from speeding up startup time by doing some clever caching to potentially creating a new optimized compiled binary format. We also looked into how WebError/abrt gather tracebacks, and discussed ways of enabling interactive traceback debugging for vanilla processes, without requiring a layer of WSGI middleware.
There was also work done on adding SystemTap probes to Python, which is very exciting. There are many ideas for various probe points, including one that I blogged about previously.
Intel iMac8,1 support
My iMac sucks at Linux. This has been something that has been nagging me for a long time, and I've been slowly trying to chip away at the problems. First, I've been doing work on a Mac port of the liveusb-creator. I also started to work on a kernel patch for getting the EFI framebuffer working, and discussed how to do it with ajax and pjones. The screen doesn't display anything after grub, and since we don't know the base address of the framebuffer, it involves writing code to iterate over memory trying to find some common pixel patterns. I'm still trying to wrap my head around all of it, but I'll probably end up just buying them beer to fix it for me.
Thincrust
Thincrust is a project that I've been excited about for a while, and I actually have some appliances deployed in a production cloud. I was able to run some ideas for various virtual appliances by one of the authors over some beers. Some pre-baked virtual appliances that you can easily throw into a cloud that I would like to see:
- WSGI appliance
- TurboGears2, Pylons, Django, etc.
- Moksha - Real-time web application in a box
- func, certmaster, puppetmaster
- Intrusion detection system
- Many more that I can't think of right now
dogtail
I'm glad to see that dogtail is still exciting people in the community. It still has a lot of potential to improve not only the way we test graphical software, but we also discussed ways of using it to teach people and automate various desktop tasks. What if you logged in after a fresh install and got the following popup bubble:
Hi, welcome to Fedora, what can I help you do today?
- Installing new software
- Setting up an email client
- Using an RSS news reader
- More...
Each task would then allow Fedora to take the wheel and walk the user through various steps. I had this idea a while ago, when dogtail first came out, and I still think it would be totally awesome. Anyway, this was not a focus of the hackfests, but merely a conversation that I had while walking to lunch :)
posted at: 11:49 | link | | 0 comments
Thu, 19 Nov 2009
TurboGears2 in Fedora & EPEL
I'm excited to announce that the TurboGears2 web application stack is now available in Fedora 12, 11 and EPEL-5.
What is TurboGears2?
TurboGears 2 is built on top of the experience of several next generation web frameworks including TurboGears 1 (of course), Django, and Rails. All of these frameworks had limitations which were frustrating in various ways, and TG2 is an answer to that frustration. We wanted something that had:
- Real multi-database support
- Horizontal data partitioning (sharding)
- Support for a variety of JavaScript toolkits, and new widget system to make building ajax heavy apps easier
- Support for multiple data-exchange formats.
- Built in extensibility via standard WSGI components
Installing the TurboGears2 stack & development tools
Fedora 12
yum install TurboGears2 python-tg-devtools
Fedora 11
yum --enablerepo=updates-testing install TurboGears2 python-tg-devtools
Red Hat Enterprise Linux 5 (with EPEL)
yum --enablerepo=epel-testing install TurboGears2 python-tg-devtools
Creating your first TG2 app
paster quickstart
Run your test suite
nosetests
Run your application
paster serve development.ini
Read the documentation
http://www.turbogears.org/2.0/docs
Contribute
If you're interested in helping maintain and improve the TG2/Pylons stack within Fedora/EPEL, please let me know. We're always looking for new Python hackers to join the team. There are still a few more components that need to be packaged and reviewed (eg: chameleon.genshi), so please take a look at the TurboGears2 page on the Fedora wiki for more details.
posted at: 00:00 | link | | 1 comments
Sun, 08 Nov 2009
New liveusb-creator release!
So I've gotten some pretty inspiring feedback from various users of the
liveusb-creator recently, so I decided to put some cycles into it this weekend and
crank out another release.
"As a non-Linux person, Live-USB Creator has improved the quality of my life measurably!" --Dr. Arthur B. Hunkins
Yesterday I released version 3.8.6 of the liveusb-creator. Changes in this
release include:
- Added the F12 beta release
- Updated to the latest Sugar on a Stick v2 beta snapshot (#522240)
- Made our automatic device detection code more robust (#519134)
- Fixed encoding of unicode strings from exceptions (#471367)
- Made our Linux device detection more robust (#517053)
- Intel Mac EFI directory preparation (#526825) thanks to Matt Domsch
- Made our windows device detection more robust
- Added a --device-checksum option, which calculates the checksum of the entire device.
- Added a --liveos-checksum option, which takes the checksum of all LiveOS files, and then generates a checksum of the checksums
- Added a --hash option for configuring the hash for the above checksum features
- Made the LiveUSBCreator.bootable_partition method a little more robust
- Better handling of file descriptors
- Some Windows-specific optimizations & fixes
- Fixed a bug with the overlay size on sticks with not much free space
- Handle device paths containing spaces when running extlinux (#490843)
- Remove some duplicate po files (#516841)
- Many translation updates
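The --liveos-checksum idea in the list above (hash each file, then hash the hashes) can be sketched roughly as follows. This is not the liveusb-creator implementation: the function name, the use of in-memory byte strings instead of files on the stick, and the choice to combine hex digests in sorted name order are all assumptions made for illustration.

```python
import hashlib


def checksum_of_checksums(files, hash_name="sha1"):
    """Hash every file, then hash the per-file digests in sorted name order.

    `files` maps a (hypothetical) file name to its contents as bytes.
    """
    combined = hashlib.new(hash_name)
    for name in sorted(files):
        digest = hashlib.new(hash_name, files[name]).hexdigest()
        combined.update(digest.encode("ascii"))
    return combined.hexdigest()


# Made-up file names and contents, purely for demonstration.
liveos = {"squashfs.img": b"image data", "osmin.img": b"more data"}
print(checksum_of_checksums(liveos))
print(checksum_of_checksums(liveos, hash_name="sha256"))
```

Sorting the names first means the result does not depend on the order in which the files happen to be listed.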
https://fedorahosted.org/releases/l/i/liveusb-creator/liveusb-creator-3.8.6.zip
Fedora
https://admin.fedoraproject.org/updates/liveusb-creator-3.8.6-1.fc11
https://admin.fedoraproject.org/updates/liveusb-creator-3.8.6-1.fc12
Source
https://fedorahosted.org/releases/l/i/liveusb-creator/liveusb-creator-3.8.6.tar.bz2
Trac
http://liveusb-creator.fedorahosted.org
posted at: 20:39 | link | | 1 comments
Tue, 13 Oct 2009
Good Python Habits: vim + pyflakes
Here is a neat little hack for running pyflakes on Python files after you save them. I like using pyflakes for quickly catching dumb errors, but you could easily replace it with a more comprehensive tool like pychecker, or pylint for more strict PEP8 compliance.
All you have to do is throw this in your ~/.vimrc
au BufWritePost *.py !pyflakes %
This has saved me *tons* of time and frustration over the past few weeks, and I have no idea how I lived without it.
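To give a feel for the kind of dumb error this catches, here is a small sketch that flags imports which are never used, one of the checks pyflakes is known for. It is written for Python 3 with the standard-library ast module and only illustrates the idea; it is not how pyflakes itself is implemented, and the function name is made up.

```python
import ast


def unused_imports(source):
    """Return the sorted names that are imported but never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import os.path` binds the name `os`
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


code = "import os\nimport sys\nprint(sys.argv)\n"
print(unused_imports(code))
```

A real checker also has to deal with scopes, `__all__`, and star imports, which this sketch ignores.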
posted at: 08:32 | link | | 3 comments
Sun, 14 Dec 2008
>>> from fedora.client import Wiki
I created a simple Python API for interacting with Fedora's MediaWiki a while back, in an attempt to gather various metrics. I just went ahead and committed it to the python-fedora modules. Here is how to use it:
>>> from fedora.client import Wiki
>>> wiki = Wiki()
>>> wiki.print_recent_changes()
From 2008-12-07 20:59:01.187363 to 2008-12-14 20:59:01.187363
500 wiki changes in the past week
== Most active wiki users ==
Bbbush............................................ 230
Konradm........................................... 25
Duffy............................................. 22
Jreznik........................................... 21
Ianweller......................................... 14
Jjmcd............................................. 14
Geroldka.......................................... 10
Gdk............................................... 9
Anouar............................................ 7
Gomix............................................. 6
== Most edited pages ==
Features/KDE42.................................... 21
SIGs/SciTech/SAGE................................. 15
FUDCon/FUDConF11.................................. 14
Special:Log/upload................................ 13
How to be a release notes beat writer............. 12
Special:Log/move.................................. 11
Design/SETroubleshootUsabilityImprovements........ 10
PackageMaintainers/FEver.......................... 9
User:Gomix........................................ 6
Zh/主要配置文件..................................... 5
>>> for event in wiki.send_request('api.php', req_params={
...         'action': 'query',
...         'list': 'logevents',
...         'format': 'json',
...         })['query']['logevents']:
...     print '%-10s %-15s %s' % (event['action'], event['user'], event['title'])
...
patrol Ianweller User:Ianweller/How to create a contributor business card
move Nippur REvanderLuit
patrol Ianweller Project Leader
move Ianweller FPL
upload Anouar Image:AnouarAbtoy.JPG
move Liangsuilong ZH/Docs/FetionOnFedora
move Liangsuilong FetionOnFedora
patrol Ianweller User:Ianweller
It uses the fedora.client.BaseClient, which is a class that simplifies interacting with arbitrary web services. Toshio and I created it a while back as the core client for talking with our various TurboGears-based Fedora Services (bodhi, pkgdb, fas, etc.), but it now seems to have morphed into a much more flexible client for talking JSON with web applications.
from datetime import datetime, timedelta
from collections import defaultdict
from fedora.client import BaseClient

class Wiki(BaseClient):

    def __init__(self, base_url='http://fedoraproject.org/w/', *args, **kwargs):
        super(Wiki, self).__init__(base_url, *args, **kwargs)

    def get_recent_changes(self, now, then, limit=500):
        """ Get recent wiki changes from `now` until `then` """
        data = self.send_request('api.php', req_params={
                'list'    : 'recentchanges',
                'action'  : 'query',
                'format'  : 'json',
                'rcprop'  : 'user|title',
                'rcend'   : then.isoformat().split('.')[0] + 'Z',
                'rclimit' : limit,
                })
        if 'error' in data:
            raise Exception(data['error']['info'])
        return data['query']['recentchanges']

    def print_recent_changes(self, days=7, show=10):
        now = datetime.utcnow()
        then = now - timedelta(days=days)
        print "From %s to %s" % (then, now)
        changes = self.get_recent_changes(now=now, then=then)
        num_changes = len(changes)
        print "%d wiki changes in the past week" % num_changes

        users = defaultdict(list) # {username: [change,]}
        pages = defaultdict(int)  # {pagename: # of edits}

        for change in changes:
            users[change['user']].append(change['title'])
            pages[change['title']] += 1

        print '\n== Most active wiki users =='
        for user, changes in sorted(users.items(),
                                    cmp=lambda x, y: cmp(len(x[1]), len(y[1])),
                                    reverse=True)[:show]:
            print ' %-50s %d' % (('%s' % user).ljust(50, '.'), len(changes))

        print '\n== Most edited pages =='
        for page, num in sorted(pages.items(),
                                cmp=lambda x, y: cmp(x[1], y[1]),
                                reverse=True)[:show]:
            print ' %-50s %d' % (('%s' % page).ljust(50, '.'), num)
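The tallying step in print_recent_changes can also be tried on its own, without touching the network. The sketch below is written for Python 3, uses collections.Counter in place of sorted(..., cmp=...), and runs on a hand-made list shaped like the 'recentchanges' entries; the sample data and the function name are invented for illustration.

```python
from collections import Counter


def top_editors_and_pages(changes, show=10):
    """Return (most active users, most edited pages) as lists of (name, count)."""
    users = Counter(change["user"] for change in changes)
    pages = Counter(change["title"] for change in changes)
    return users.most_common(show), pages.most_common(show)


# Hypothetical entries, mimicking the 'user' and 'title' keys requested via rcprop.
sample = [
    {"user": "alice", "title": "FrontPage"},
    {"user": "alice", "title": "Docs"},
    {"user": "bob", "title": "FrontPage"},
]
users, pages = top_editors_and_pages(sample)
for name, count in users:
    print(" %s %d" % (name.ljust(50, "."), count))
```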
I added a Wiki.login method to the latest version, but it isn't quite working yet: some minor limitations in the ProxyClient mean that we currently cannot handle authenticated requests. This shouldn't be very difficult to implement, and it matters because we need to be able to run authenticated queries as a 'bot' account in order to get around the 500-entry API return limit.
This module makes it easy to talk to MediaWiki's API, so if you do anything cool with it feel free to send patches here. It's currently not being shipped in a python-fedora release, so you'll have to grab the code from Bazaar:
bzr branch bzr://bzr.fedorahosted.org/bzr/python-fedora/python-fedora-devel
posted at: 17:12 | link | | 4 comments
Sat, 13 Dec 2008
Time spent in updates-testing purgatory
Will Woods asked me on IRC earlier today how easy it would be to determine the amount of time Fedora updates spend in testing within bodhi. It turned out to be fairly easy to calculate, so I thought I would share the code and results.
from datetime import timedelta
from bodhi.model import PackageUpdate

deltas = []
occurrences = {}
accumulative = timedelta()

for update in PackageUpdate.select():
    for comment in update.comments:
        if comment.text == 'This update has been pushed to testing':
            for othercomment in update.comments:
                if othercomment.text == 'This update has been pushed to stable':
                    delta = othercomment.timestamp - comment.timestamp
                    deltas.append(delta)
                    occurrences[delta.days] = occurrences.setdefault(delta.days, 0) + 1
                    accumulative += deltas[-1]
                    break
            break

deltas.sort()
all = PackageUpdate.select().count()
percentage = int(float(len(deltas)) / float(all) * 100)
mode = sorted(occurrences.items(), cmp=lambda x, y: cmp(x[1], y[1]))[-1][0]

print "%d out of %d updates went through testing (%d%%)" % (len(deltas), all, percentage)
print "mean = %d days" % (accumulative.days / len(deltas))
print "median = %d days" % deltas[len(deltas) / 2].days
print "mode = %d days" % mode
4878 out of 10829 updates went through testing (45%)
mean = 17 days
median = 11 days
mode = 6 days
So, the most common stay in updates-testing is about six days (the mode), and half of all updates leave within 11 days (the median). This is interesting when taking into consideration the testing workflow mechanisms that bodhi employs. An update can go from testing to stable in two ways: 1) The update's karma can reach an optional stable threshold, and automatically get pushed to the stable repository based on positive community feedback. 2) The developer can request that the update be marked as stable. After an update sits in testing for two weeks, bodhi will send the developer nagmail, which seems to help mitigate stale updates. When initially deploying bodhi, I thought that we would get bogged down with a ton of stale testing updates and would have to implement a timeout to have them automatically get marked as stable. This is still a viable option (which would require FESCo rubberstamping), but I'm quite surprised to see how effective this community-driven workflow is already. Now we just need to encourage more people to use it :)
Due to the limitations of the current model I couldn't figure out an easy way to determine which updates were marked as stable by positive community feedback. This issue will be assessed with the long-awaited SQLAlchemy port that I will hopefully finish up at some point early next year.
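The summary statistics themselves can be computed without bodhi. Here is a minimal sketch, assuming you already have a list of timedelta objects like deltas above. It is written for Python 3 and uses the statistics module rather than the hand-rolled arithmetic; the function name and the sample values are made up and are not real bodhi data.

```python
from datetime import timedelta
from statistics import mean, median, mode


def summarize(deltas):
    """Return (mean, median, mode) of the whole-day parts of the deltas."""
    days = [delta.days for delta in deltas]
    return mean(days), median(days), mode(days)


# Invented sample: time spent in testing for five imaginary updates.
sample = [timedelta(days=d) for d in (2, 6, 6, 11, 30)]
print("mean = %d days, median = %d days, mode = %d days" % summarize(sample))
```

Two small differences from the script above: this averages whole days rather than the full timedeltas, and statistics.median averages the two middle values when the count is even instead of picking one by index.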
posted at: 02:13 | link | | 1 comments
Wed, 16 Jul 2008
Python dictionary optimizations
In my recent journey through the book Beautiful Code, I came across a chapter devoted to Python's dictionary implementation. I found the whole thing quite fascinating, due to the sheer simplicity and power of the design. The author mentions various special-case optimizations that the Python developers cater for in the CPython dictionary implementation, which I think are valuable to share.
Key lookups
In CPython, all PyDictObjects are optimized for dictionaries
containing only string keys. This seems like a very common use case that is
definitely worth catering for. The key lookup function pointer looks like this:
struct PyDictObject {
    PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, long hash);
    ...
};
ma_lookup is initially set to the lookdict_string function (renamed to lookdict_unicode in 3.0), which assumes that both the keys in the dictionary and the key being searched for are standard PyStringObjects. It is then able to make a couple of optimizations, such as skipping various error checks, since string-to-string comparisons never raise exceptions. There is also no need for rich object comparisons, which means we avoid calling PyObject_RichCompareBool, and always use _PyString_Eq directly.
This string-optimized key lookup function is utilized until you search for a non-string key. When lookdict_string detects this, it permanently changes the ma_lookup function to a slower, more generic lookdict function. Here is an example of how to trigger this degradation:
>>> d = {'foo': 'bar'}
>>> d.get(1) # Congratulations, your dictionary is now slower...
>>> d.get(u'foo') # Yes, even unicode objects trigger this degradation as well
Jython does not contain this optimization, however, it does have a string-specialized map object, org.python.core.PyStringMap, which is used for the __dict__ underpinning of all class instances and modules. User code that creates a dictionary utilizes a different class, org.python.core.PyDictionary, which is a heavyweight object that uses the java.util.Hashtable along with some extra indirection, allowing it to be subclassed.
Small dictionaries
Python's dictionary makes an effort to never be more than 2/3rds full. Since the
default size of dict is 8, this allows you to have 5 active entries in your
dict while avoiding an additional malloc. Dictionaries used for keyword
arguments are usually within this limit, and thus are fairly efficient (along
with the fact that they most likely come from a pool of cached unused dicts).
This also can help improve cache locality. For example, the PyDictObject
structure uses 124 bytes of space (on x86 w/gcc) and therefore can fit into two
64-byte cache lines.
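The arithmetic behind "8 slots, 5 entries" can be checked with a couple of lines. This is a sketch that assumes the table grows as soon as three times the number of used slots reaches twice the table size, which is one way of stating the 2/3 rule described here; the function name is mine, and later CPython releases changed the dict layout, so treat it as an illustration of the rule rather than of any particular interpreter version.

```python
def max_entries_before_resize(slots):
    """Largest entry count n for which 3 * n is still below 2 * slots."""
    return (2 * slots - 1) // 3


for slots in (8, 32, 128):
    print(slots, max_entries_before_resize(slots))
```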
So, the moral of the story: use dictionaries with string-only keys, and only look for string keys within them. If you can keep them small enough to avoid the extra malloc (<= 5), bonus. As expected, things get better in Python 3.0, as unicode keys will no longer slow your dictionary down.
posted at: 10:39 | link | | 0 comments
Mon, 24 Mar 2008
PyCon 2008
I was in Chicago last week for PyCon 2008. It was my first time in the Windy City, and I must say that I was thoroughly impressed. As expected in any city, we got a chance to see a lady get her purse snatched, and a mentally unstable gentleman on the train yelling profanities at god. Anyway, the conference itself was extremely well done, and tons of awesome innovation happened at the sprints afterwards.
Day 1: Tutorials
8+ hours of TurboGears/Pylons/WSGI tutorials. Awesome. I'm really
excited with what is in the works for TurboGears2. By wielding Pylons, the
TG2 team was able to completely re-write their framework with minimal amounts
of code, while at the same time, gaining a *ton* of new features
and some amazing middleware. Mark Ramm and Ben Bangert took turns walking us through the
deep internals of their frameworks, while also giving some examples of how to use
them.
Sessions
During the 3-day conference portion of PyCon, there was a vast plethora of
incredibly interesting sessions and conversations. You can find a schedule of
the talks and some slides here. Everything was
video taped as well, so the sessions should be making their way on to YouTube
hopefully at some point soon.
Here are some things that caught my attention while I was there.
WSGI
Defined by Phillip J. Eby in PEP-333, the Web Server Gateway Interface is a simple interface between web servers, applications, and frameworks. Or, as explained by Ian Bicking: WSGI is a series of Tubes. The basic idea is that it lets you connect a bunch of different applications together into a functioning whole.
Since TurboGears2 is based on Pylons, it will be a full blown WSGI application out of the box, loaded with lots of useful middleware (WebError, Routes, Sessions, Caching, etc), and will allow you to use any WSGI server that you wish (Paste, CherryPy, orbited, mod_wsgi, etc).
An example of a basic Hello World WSGI application:
def wsgi_app(environ, start_response):
    start_response('200 OK', [('content-type', 'text/html')])
    return ['Hello world!']
So, what is WSGI middleware? Well, it's essentially the WSGI equivalent of a python decorator, but instead of wrapping one function in another, you're wrapping one web-app in another. You can see a list of some existing WSGI middleware here.
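To make the decorator analogy concrete, here is a small sketch of a middleware wrapping the Hello World app. It is written for Python 3, where WSGI bodies are bytes; UppercaseMiddleware and call_app are hypothetical names invented for this example, and call_app fakes a request by hand instead of going through a real server.

```python
def wsgi_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"Hello world!"]


class UppercaseMiddleware(object):
    """Hypothetical middleware: wraps any WSGI app and upper-cases its body."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Pass the request straight through, then transform the response body.
        return [chunk.upper() for chunk in self.app(environ, start_response)]


def call_app(app):
    """Invoke a WSGI app by hand with a bare-bones environ; return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, start_response))
    return captured["status"], body


print(call_app(UppercaseMiddleware(wsgi_app)))
```

Because the wrapper is itself a WSGI callable, it can be wrapped again, which is what lets middleware stack.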
virtualenv
With so many new shiny python programs to play with, I really tried to resist
the urge to easy_install everything into my global Python site-packages so I
could tinker with things. This is generally a Bad Thing in a distribution, as
easy_install not only installs things behind your package manager's back,
but it also lacks the ability to uninstall anything, unless you want to take Zed's easy_fucking_uninstall
approach ;) During the TurboGears tutorial, I was introduced to a tool called
virtualenv, which will set up a virtual python environment in which you can
easy_install as many eggs as you want without worrying about butchering
your site-packages.
$ easy_install virtualenv
$ virtualenv --no-site-packages foo
$ cd foo; source bin/activate
$ easy_install <shiny python programs>
nose
I've been in love with nose since day
one, but realized that I haven't been utilizing it to its fullest abilities.
I blogged in the past about nose's
profiler plugin. Come to find out, nose offers a lot more plugins that can
seriously help make your life easier:
$ nosetests --pdb --pdb-failures
.............................................................> /home/lmacken/tg1.1/turbogears/turbogears/identity/tests/test_visit.py(92)test_cookie_permanent()
-> assert abs(should_expire - expires) < 3
(Pdb) locals()
{'morsel': <Morsel: tg-visit='452c94de3900fc2adff2cd6b0b0f04c4533e3e9e'>, 'self': <turbogears.identity.tests.test_visit.TestVisit testMethod=test_cookie_permanent>, 'expires': 1206228604.0, 'should_expire': 1206232205.0, 'permanent': False}
(Pdb)
You can also measure code coverage during your unit test execution using the '--with-coverage' option, which utilizes coverage.py.
SQLAlchemy
Also known as "the greatest object-relational-mapper created for any language. ever.", SQLAlchemy 0.4 has seen vast improvements since 0.3. Among them, a new declarative
API is now available that essentially lets you define your class, Table and
mapper constructs "at once" under a single class declaration (giving you a
similar ActiveMapper feel like SQLObject or Elixir).
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite://')
Base = declarative_base(engine)

class SomeClass(Base):
    __tablename__ = 'some_table'
    id = Column('id', Integer, primary_key=True)
    name = Column('name', String(50))
Unicode, demystified.
By far, the most frustrating problems I've ever encountered in Python have been
unicode related. I was fortunate enough to catch Kumar McMillan's
presentation, "Unicode in Python, Completely Demystified". This presentation
helped enlighten many on the concept of unicode, clear up many misconceptions,
and explain how to handle it properly in Python. Check out his slides for more details, but
the general idea here is to follow these three rules:
- decode early
- unicode everywhere
- encode late
def to_unicode_or_bust(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            obj = unicode(obj, encoding)
    return obj
Later that night I went and shined some light on some dark corners of certain projects that I've been working on to try and handle unicode the Right Way.
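For anyone reading this with a newer interpreter, the three rules can be sketched in Python 3 terms, where str is already unicode and bytes is the encoded form. The to_text helper is my own analogue of to_unicode_or_bust, not something from the talk, and the sample string is arbitrary.

```python
def to_text(obj, encoding="utf-8"):
    """Python 3 analogue of to_unicode_or_bust: decode bytes, pass the rest through."""
    if isinstance(obj, bytes):
        obj = obj.decode(encoding)
    return obj


raw = "caf\u00e9".encode("utf-8")  # bytes arriving from a file or socket
text = to_text(raw)                # decode early
shout = text.upper()               # unicode everywhere
wire = shout.encode("utf-8")       # encode late
print(len(raw), len(text), wire)
```

The byte string is one element longer than the text because the accented character takes two bytes in UTF-8, which is exactly the kind of mismatch that bites when the two are mixed.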
Grassyknoll
After the code sprints, I got a chance to see these guys show off their hard
work. grassyknoll is a
search engine written in Python. With the ability to handle multiple backends,
frontends, and wire formats, grassyknoll has a ton of potential to
revolutionize the open source search engine. There has been recent talk in
Fedora land about what kind of search engine to use, and I think grassyknoll is
definitely a viable option.
Packaging BOF
Toshio, Spot, and I attended a Packaging BOF where we discussed our
experiences with distutils and setuptools with a bunch of people from various
companies and distros. This then sparked discussions on python-dev and the
distutils-sig mailing lists. You can also find the details of the BOF session
on the Python wiki. There
is definitely a lot of energy behind this, so hopefully we'll see some good changes
in setuptools in the near future that will make our lives as distro packagers much easier :)
Orbited
Orbited is an HTTP daemon that is optimized for long-lasting comet
connections. This allows you to write real-time web applications with
ease. For example, you can embed an IRC channel anywhere.
You can also use orbited as a WSGI server! Toshio did some brief benchmarking of the CherryPy{2,3}, Paste, and Orbited WSGI servers, and orbited seemed to be the clear winner in all scenarios. There is a good chance that we will be using orbited to handle our comet widgets within MyFedora :)
Code Sprints
I stayed the entire time for the code sprints, and mainly focused on TurboGears hacking. This is what I ended up working on:
- Added SQLAlchemy support to turbogears.testutil.DBTest (Ticket #1764). When you inherit from this class, it will automatically set up and tear down your SQLObject or SQLAlchemy database before and after each of your unit tests.
- Added a FlotWidget using ToscaWidgets to twTools. This widget allows you to create attractive graphs with ease.
- Made the TurboGears2 templating engine configurable (Ticket #1680). Things were hardcoded to use genshi; this is no longer the case.
- WebTest integration for unit tests (Ticket #1762). I wrote some high-level unit testing classes that wrap a WebTest object around your WSGI app. This gives you an extremely powerful API to write "framework independent" unit tests. The WebTest.get/post methods simply return WebOb objects, which allow for drastic simplification of your unittests. This also helped decouple the TG testutils from using CherryPy internals (one step closer to CherryPy3 support in TurboGears). As I mentioned on the TurboGears-trunk list, these changes will make writing unit tests a breeze:
class TestPages(testutil.DBWebTest):

    def test_forbidden(self):
        self.app.get('/hot_action', status=403)

    def test_webob_response(self):
        user = User(user_name=u"test", password=u"test")
        self.login_user(user)
        res = self.app.get('/hot_action')
        assert "Hot WSGI action" in res
        assert res.namespace['tg_flash'] == u'Hot WSGI action'
The WebTest integration is planned to hit in the TurboGears 1.1 release, deprecating testutils.{call,create_request}.
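To give a feel for why testing at the WSGI layer is "framework independent", here is a rough Python 3 sketch that uses only the standard library. It is not WebTest and not the TurboGears testutil code: hot_action_app and the get() helper are hypothetical stand-ins that call a WSGI callable in-process, which is the same basic trick WebTest builds on.

```python
from wsgiref.util import setup_testing_defaults


def hot_action_app(environ, start_response):
    """Hypothetical WSGI app: only logged-in users may see /hot_action."""
    if environ.get("REMOTE_USER"):
        status, body = "200 OK", b"Hot WSGI action"
    else:
        status, body = "403 Forbidden", b"Forbidden"
    start_response(status, [("Content-Type", "text/plain")])
    return [body]


def get(app, path, user=None):
    """Call a WSGI app in-process and return (status code, body text)."""
    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    setup_testing_defaults(environ)   # fills in the remaining required WSGI keys
    if user:
        environ["REMOTE_USER"] = user
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Keep only the numeric part of e.g. "403 Forbidden".
        captured["status"] = int(status.split()[0])

    body = b"".join(app(environ, start_response)).decode("utf-8")
    return captured["status"], body


def test_forbidden():
    status, _ = get(hot_action_app, "/hot_action")
    assert status == 403


def test_logged_in():
    status, body = get(hot_action_app, "/hot_action", user="test")
    assert status == 200
    assert "Hot WSGI action" in body
```

Nothing in the two test functions knows which framework produced the app, only that it speaks WSGI.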
Want to read more blog posts about PyCon 2008? You can find links to lots of PyCon related posts here and on Planet Python.
posted at: 17:05 | link | | 1 comments
Wed, 19 Dec 2007
TurboFlot 0.0.1
In an effort to clean up bodhi's metrics code a bit, I wrote a TurboFlot plugin that allows you to wield the jQuery plugin flot inside of TurboGears applications. The code is quite trivial -- it's essentially just a TurboGears JSON proxy to the jQuery flot plugin. Breaking this code out into its own widget makes it really easy to generate shiny graphs in a Pythonic fashion, without having to write a line of JavaScript.

Check out the README to see the code for the example above.
To use TurboFlot in your own application, you just pass your data and graph options to the widget, and then throw it up to your template. Read the flot API documentation for details on all of the arguments. Here is a simple usage example:
flot = TurboFlot([
    {
        'data' : [[0, 3], [4, 8], [8, 5], [9, 13]],
        'lines' : { 'show' : True, 'fill' : True }
    }],
    {
        'grid' : { 'backgroundColor' : '#fffaff' },
        'yaxis' : { 'max' : '850' }
    }
)
Then, to display the widget in your template, you simply use:
${flot.display()}
The code for the widget itself is pretty simple. It just takes your data and graph options, encodes them as JSON and tosses them at flot.
class TurboFlot(Widget):
    """
        A TurboGears Flot Widget.
    """
    template = """
      <div xmlns:py="http://purl.org/kid/ns#" id="turboflot"
           style="width:${width};height:${height};">
        <script>
          $.plot($("#turboflot"), ${data}, ${options});
        </script>
      </div>
    """
    params = ["data", "options", "height", "width"]
    javascript = [JSLink('turboflot', 'excanvas.js'),
                  JSLink("turboflot", "jquery.js"),
                  JSLink("turboflot", "jquery.flot.js")]

    def __init__(self, data, options={}, height="300px", width="600px"):
        self.data = simplejson.dumps(data)
        self.options = simplejson.dumps(options)
        self.height = height
        self.width = width
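If you want to see the JSON step on its own, without TurboGears in the picture, the following Python 3 sketch uses the standard json module in place of simplejson. render_plot_call is a hypothetical helper name, not part of TurboFlot; it just builds the same $.plot(...) string the template above ends up with.

```python
import json


def render_plot_call(data, options=None, target="#turboflot"):
    """Build the $.plot(...) JavaScript call from plain Python data structures."""
    options = options or {}
    return '$.plot($("%s"), %s, %s);' % (
        target,
        json.dumps(data),
        json.dumps(options, sort_keys=True),
    )


series = [{"data": [[0, 3], [4, 8]], "lines": {"show": True}}]
script = render_plot_call(series, {"yaxis": {"max": 850}})
print(script)
```

Note that Python's True comes out as JavaScript's true, which is exactly the kind of detail the JSON encoder takes care of for you.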
You can download the latest releases from the Python Package Index:
http://pypi.python.org/pypi/TurboFlot
Or you can grab my latest development tree out of mercurial:
http://hg.lewk.org/TurboFlot
As always, patches are welcome :)
posted at: 14:21 | link | | 1 comments
Sat, 08 Dec 2007
Fedora update metrics
Using flot, a plotting library for jQuery, I threw together some shiny metrics for bodhi. It's pretty amazing to see how a Fedora release evolves over time, with almost as many enhancements as bugfixes. This could arguably be a bad thing, as our "stable" bits seem to change so much; but it definitely shows how much innovation is happening in Fedora.
I should also note that the data on the graphs may look different than the numbers you see next to each category in the bodhi menu. This is due to the fact that updates may contain multiple builds, and the graphs account for all builds in the system.
When I get some free cycles I'd like to generate some metrics from the old updates system for FC4-FC6. I can imagine that the differences will be pretty drastic, considering how the old updates tool was internal to Red Hat, and that the majority of our top packagers are community folks.
posted at: 19:05 | link | | 0 comments
Mon, 01 Oct 2007
Use your Nose!
Every programmer out there [hopefully] knows that unittests are an essential part of any growing body of code, especially in the open source world. However, most hackers out there either never write test cases (let alone comments), or usually put them off until "later" (aka: never). Having to deal with Java and JUnit tests in college not only made me not want to write unit tests, but it made me want to kill myself and everyone around me. Thankfully, I learned Python.
So, I just happen to maintain a piece of software in Fedora called nose (which lives in the python-nose package). Nose is a discovery-based unittest extension for Python, and is also a part of the TurboGears stack. If you're hacking on a TurboGears project, the turbogears.testutil module provides some incredibly useful features that make writing tests powerfully trivial.
For example, in the code below (taken from bodhi), I create a test case that utilizes a fresh SQLite database in memory. Inheriting from the testutil.DBTest parent class, this database will be created and torn down automagically before and after each test case is run -- ensuring that my tests are executed in complete isolation. With this example, I wrote a test case to ensure that unauthenticated people cannot create a new update.
import urllib, cherrypy
from turbogears import update_config, database, testutil, url

update_config(configfile='dev.cfg', modulename='bodhi.config')
database.set_db_uri("sqlite:///:memory:")

class TestControllers(testutil.DBTest):

    def test_unauthenticated_update(self):
        params = {
            'builds' : 'TurboGears-1.0.2.2-2.fc7',
            'release' : 'Fedora 7',
            'type' : 'enhancement',
            'bugs' : '1234 5678',
            'cves' : 'CVE-2020-0001',
            'notes' : 'foobar'
        }
        path = url('/save?' + urllib.urlencode(params))
        testutil.createRequest(path, method='POST')
        assert "You must provide your credentials before accessing this resource." in cherrypy.response.body[0]
In the above example, the TestControllers class is automatically detected by nose, which then executes each method that begins with the word 'test'. To run your unittests, just type 'nosetests'.
[lmacken@tomservo bodhi]$ nosetests
.................................
----------------------------------------------------------------------
Ran 33 tests in 16.798s
OK
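The per-test isolation described above is easy to see in miniature. Here is a rough Python 3 sketch using only the standard library's unittest and sqlite3 modules; it is not turbogears.testutil.DBTest, and the class, table, and helper names are invented for the example.

```python
import sqlite3
import unittest


class DBTestSketch(unittest.TestCase):
    """Each test gets its own in-memory database, built and dropped around it."""

    def setUp(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE updates (title TEXT)")

    def tearDown(self):
        self.db.close()

    def count(self):
        return self.db.execute("SELECT COUNT(*) FROM updates").fetchone()[0]

    def test_insert(self):
        self.db.execute("INSERT INTO updates VALUES ('TurboGears-1.0.2.2-2.fc7')")
        self.assertEqual(self.count(), 1)

    def test_starts_empty(self):
        # Passes regardless of test order: nothing leaks from test_insert.
        self.assertEqual(self.count(), 0)


def run_sketch():
    """Collect and run the test_* methods, much as a test runner would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DBTestSketch)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_sketch() runs both tests; a discovery-based runner would pick the class up on its own because of the test_ prefix on the methods.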
Now, for the fun part. Nose comes equipped with a profiling plugin that will profile your test cases using Python's hotshot module.
So, I went ahead and added a 'profile' target to bodhi's Makefile:
profile:
	nosetests --with-profile --profile-stats-file=nose.prof
	python -c "import hotshot.stats ; stats = hotshot.stats.load('nose.prof') ; stats.sort_stats('time', 'calls') ; stats.print_stats(20)"
Now, typing 'make profile' will execute and profile all of our unit tests, and spit out the top 20 method calls -- ordered by internal time and call count.
[lmacken@tomservo bodhi]$ make profile
nosetests --with-profile --profile-stats-file=nose.prof
.................................
----------------------------------------------------------------------
Ran 33 tests in 42.878s
OK
python -c "import hotshot.stats ; stats = hotshot.stats.load('nose.prof') ; stats.sort_stats('time', 'calls') ; stats.print_stats(20)"
800986 function calls (702850 primitive calls) in 42.878 CPU seconds
Ordered by: internal time, call count
List reduced from 3815 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
14 13.675 0.977 13.675 0.977 /usr/lib/python2.5/socket.py:71(ssl)
31 10.683 0.345 10.683 0.345 /usr/lib/python2.5/httplib.py:994(_read)
2478/2429 9.297 0.004 9.677 0.004 :1()
1 0.604 0.604 0.604 0.604 /usr/lib/python2.5/commands.py:50(getstatusoutput)
2999 0.536 0.000 0.539 0.000 /usr/lib/python2.5/site-packages/sqlobject/sqlite/sqliteconnection.py:177(_executeRetry)
105899 0.448 0.000 0.773 0.000 Modules/pyexpat.c:871(Default)
60 0.327 0.005 1.102 0.018 /usr/lib/python2.5/site-packages/kid/parser.py:343(_buildForeign)
105899 0.325 0.000 0.325 0.000 /usr/lib/python2.5/site-packages/kid/parser.py:452(_default)
3396 0.280 0.000 0.420 0.000 /usr/lib/python2.5/site-packages/cherrypy/config.py:107(get)
2965 0.263 0.000 0.263 0.000 /usr/lib/python2.5/logging/__init__.py:364(formatTime)
44964/6587 0.238 0.000 0.252 0.000 /usr/lib/python2.5/site-packages/kid/parser.py:156(_pull)
60 0.116 0.002 0.116 0.002 /usr/lib/python2.5/site-packages/kid/compiler.py:38(py_compile)
8127 0.114 0.000 0.114 0.000 /usr/lib/python2.5/site-packages/cherrypy/_cputil.py:311(lower_to_camel)
8982 0.110 0.000 0.137 0.000 /usr/lib/python2.5/site-packages/sqlobject/dbconnection.py:902(__getattr__)
13740/4044 0.108 0.000 2.176 0.001 /usr/lib/python2.5/site-packages/kid/parser.py:209(_coalesce)
24353/4026 0.107 0.000 2.143 0.001 /usr/lib/python2.5/site-packages/kid/parser.py:174(_track)
3170 0.093 0.000 0.398 0.000 /usr/lib/python2.5/logging/__init__.py:405(format)
1 0.082 0.082 0.082 0.082 /usr/lib/python2.5/site-packages/rpm/__init__.py:5()
4777 0.081 0.000 1.320 0.000 /usr/lib/python2.5/site-packages/kid/serialization.py:564(generate)
759/176 0.074 0.000 0.210 0.001 /usr/lib/python2.5/sre_parse.py:385(_parse)
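The hotshot module only exists in Python 2. On Python 3, a similar "top 20 by internal time and call count" report can be had from cProfile and pstats, as in the sketch below. This is a swapped-in technique rather than what the nose plugin does, and slow_sum and profile_top are hypothetical names used only to have something to measure.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Stand-in workload; replace with whatever you want to profile."""
    return sum(i * i for i in range(n))


def profile_top(func, *args, limit=20):
    """Run func under cProfile and return (result, report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Same ordering as the Makefile target: internal time, then call count.
    stats.sort_stats("time", "calls").print_stats(limit)
    return result, out.getvalue()


result, report = profile_top(slow_sum, 10000)
print(report)
```

The sort keys 'time' and 'calls' are the same ones passed to sort_stats in the Makefile target, so the report is ordered the same way.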
posted at: 09:40 | link | | 3 comments
Sat, 01 Sep 2007
Recovering a Pyblosxom blog using liferea's RSS cache
My buddy who used to host lewk.org didn't pay his bills, so his server got taken down last week. What sucks is that I never backed up my Pyblosxom data. What doesn't suck is that thankfully Liferea, my RSS reader, did for me.
Grepping through ~/.liferea_1.2/cache/feeds, I was able to find my blog cached in some XML format. Then I wrote a little bit of code to re-create my Pyblosxom entry structure with the proper filenames and timestamps.
#!/usr/bin/python -tt
"""
Turns XML into pyblosxom blog entries.

It parses BLOG_XML pulling out blog entries in the form of:

    <feed version="1.1">
        <item>
            <title></title>
            <description></description>
            <source>http://foo.com/blog/2007/08/20/bar.html</source>
            <time>1187621268</time>
        </item>
    </feed>

The file '2007/08/20/bar.txt' will be created in pyblosxom format with
the appropriate timestamp. The #mdate is used by the pyblosxom.vim plugin.

    title
    #mdate Aug 20 10:47:48 2007
    <p>description</p>
"""
import os
import time

try: from xml.etree import cElementTree
except ImportError: import cElementTree
iterparse = cElementTree.iterparse

entries = {} # { 'title' : <Element> }
BLOG_XML = 'blog.xml'
BLOG_ROOT = 'http://foo.com/blog/'

def getField(elem, field):
    for child in elem:
        if child.tag == field:
            return child.text

## Pull out all feed items, removing older duplicates
for event, elem in iterparse(BLOG_XML):
    if elem.tag == 'feed':
        for child in elem:
            if child.tag == 'item':
                title = getField(child, 'title')
                if entries.has_key(title):
                    if int(getField(child, 'time')) > \
                       int(getField(entries[title], 'time')):
                        entries[title] = child
                else:
                    entries[title] = child

for title, entry in entries.items():
    source = getField(entry, 'source').replace(BLOG_ROOT, '')
    source = source.replace('.html', '.txt')
    if not os.path.isdir(os.path.dirname(source)):
        os.makedirs(os.path.dirname(source))
    output = file(source, 'w')
    output.write(title + '\n')
    mtime = time.localtime(int(getField(entry, 'time')))
    mdate = time.strftime("%b %e %H:%M:%S %Y", mtime)
    output.write("#mdate %s\n" % mdate)
    output.write("<p>%s</p>\n" % getField(entry, 'description'))
    output.close()
    timestamp = time.strftime("%y%m%d%H%M", mtime)
    os.system("touch -t %s %s" % (timestamp, source))
It also adds an #mdate tag into each entry, which is read by the spiffy pyblosxom mdate vim hack that Jordan Sissel wrote to restore each entry's original timestamp after editing. His code only works on FreeBSD at the moment, so I started a pyblosxom.vim plugin that works on Linux (hopefully it will eventually support both, along with a bunch of other handy functions). You can find all of this code in my mercurial repo: hg.lewk.org/xml2pyblosxom
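The "keep only the newest copy of each entry" step is the heart of the script, so here is that piece alone as a Python 3 sketch using xml.etree.ElementTree. The SAMPLE feed and the newest_items name are made up for illustration; the element names follow the format described in the script's docstring.

```python
import xml.etree.ElementTree as ET

# Made-up sample: two cached copies of "nose" and a single copy of "flot".
SAMPLE = """
<feed version="1.1">
  <item><title>nose</title><description>old</description><time>100</time></item>
  <item><title>nose</title><description>new</description><time>200</time></item>
  <item><title>flot</title><description>only</description><time>150</time></item>
</feed>
"""


def newest_items(xml_text):
    """Map each title to its most recent <item>, judged by the <time> field."""
    newest = {}
    for item in ET.fromstring(xml_text.strip()).iter("item"):
        title = item.findtext("title")
        stamp = int(item.findtext("time"))
        if title not in newest or stamp > int(newest[title].findtext("time")):
            newest[title] = item
    return newest


entries = newest_items(SAMPLE)
```

Here entries["nose"] ends up pointing at the item with time 200, so the stale cached copy is dropped.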
posted at: 11:44 | link | | 30 comments
Sat, 19 May 2007
Security LiveCD
So last week I created an initial version of a potential Fedora Security LiveCD spin. The goal is to provide a fully functional livecd based on Fedora for use in security auditing, penetration testing, and forensics. I created it as a bonus project for my Security Auditing class (instead of following the 5 pages of instructions she handed out on how to create a Gentoo livecd; mad props to davidz for creating an amazing LiveCD tool), but it has the potential to be extremely useful and also help increase the number and quality of Fedora's security tools. I threw in all of the tools I could find that already exist in Fedora, but I'm sure I'm missing a bunch, so feel free to send patches or suggestions. I also added a Wishlist of packages that I would eventually like to see make their way into Fedora, after the core->extras merge reviews are done.
I would eventually like to see Fedora offer a LiveCD that puts all of the existing linux security livecds to shame. We have quite a ways to go, but this is a start. I'm taking a computer forensics class next quarter, so I will be expanding it to fit the needs of our class as well.
posted at: 14:15 | link | | 0 comments
Wed, 14 Feb 2007
break
So my Thanksgiving break was far from a break. I spent a couple of days last week at Red Hat's Westford office before heading back up to RIT to start a new quarter. In my two days in the office I was able to touch base with a bunch of people, and get a bunch of stuff done as well. I had a long discussion with dmalcom about integrating the Fedora Updates System with Beaker/TableCloth. He also gave me a quick rundown on a bunch of the Red Hat QA infrastructure that is currently being used. Ideally we'd like to be able to crunch all package updates through an automated test system before pushing them out to the world. Involvement needed: FedoraTesting.
Later that day I met with jrb and jkeating about getting a package updating system in place for a new Red Hat product that is going out the door very soon. This means that much work will be going into the new UpdatesSystem in the near future, which means I get to dig deeper into the world of TurboGears :)
On Thursday I cranked a bunch of code out, but was fairly distracted most of the time by the OLPC laptops that were lying around the office. I must say, it is an absolutely incredible machine. The screen is gorgeous, and its camera is very impressive. I hung around later at the office for an OLPC hackfest that was going down.
I was busy working on the updates system most of the time, but then later on I started looking into some Python start-up issues, which can be seen by doing:
strace python 2>&1 | grep ENOENT
You'll notice a ton of syscalls like the following, which try to open/stat modules in locations that do not exist:
stat64("/usr/lib/python24.zip/posixpath", 0xbfdb5094) = -1 ENOENT (No such file or directory)
open("/usr/lib/python24.zip/posixpath.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python24.zip/posixpathmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python24.zip/posixpath.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python24.zip/posixpath.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/python2.4/posixpath", 0xbfdb5094) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/posixpath.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/posixpathmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/posixpath.py", O_RDONLY|O_LARGEFILE) = 5
So it's obvious that modules could exist in multiple locations, but if you are repeatedly going to check a series of directories, such as /usr/lib/python24.zip, wouldn't it be a *bit* smarter to check if they exist first, and then avoid checking there in the future? Doing so would help cut down on the 233+ syscalls python makes while starting up looking for modules. I really don't have any free cycles to try and add some sense into Python, so I really hope someone can beat me to a patch.
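To make the idea concrete, here is a small Python 3 sketch of "check each location once, then stop looking there". It is only an illustration of the suggestion, not how the interpreter's import machinery is implemented, and both function names are hypothetical.

```python
import os


def existing_entries(search_path):
    """Drop search-path entries that do not exist, checking each one only once."""
    return [entry for entry in search_path if os.path.exists(entry)]


def find_module_file(name, search_path):
    """Look for name.py in each surviving directory; return its path or None."""
    for directory in search_path:
        candidate = os.path.join(directory, name + ".py")
        if os.path.isfile(candidate):
            return candidate
    return None
```

You would call existing_entries(sys.path) once at start-up and hand the pruned list to every later lookup, so a missing /usr/lib/python24.zip costs one failed check instead of several per module.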
TurboGears 1.0b2

I came back home to find the new TurboGears book in my mailbox, which has been extremely informative, aside from the fact that the project has awesome online docs as well. I pushed out the latest TurboGears release, 1.0b2, for FC6 and rawhide yesterday as well.
posted at: 21:12 | link | | 0 comments










