Tuesday, September 30, 2014

PyCon UK 2014

Hello, I'm never going to remember everything until next Monday, so have a braindump. I'll put up an item for questions in that meeting. These are the talks I went to; I've linked slides for some but not all.

Note PyPy is an alternative Python interpreter; PyPI is where pips come from; PiPy is a Raspberry Pi lib; and if you name your next lib PiPi I will hurt you.

--- Keynote: Ecosystem threats to Python (Van Lindberg)

Java, JS and Go. These are threats because Python has poor interop with them. He did mention it's not a zero-sum game. There wasn't a rallying cry or solution offered; Van just called these out.

There was a lot of instinctive dislike of Java. It got called out as ancient, 90s and enterprise-y. There didn't seem to be much understanding of why people might choose the JVM as a platform. Scala got a brief mention at least. Java is the native language for Android; I can't remember if Van mentioned this but Python is shut out of mobile right now.

JS was lumped with node.js. CoffeeScript, PureScript etc weren't mentioned. I don't remember much beyond a few cracks really.

There was some admiration for Go, particularly the single-executable deployment story. Michael Foord (Go dev at Canonical) told me afterwards that Go bucks the trend on exception handling and several libs try to fix that. The performance is good though - Van has a fork of Swift with pieces rewritten in Go and he says it's 3x faster.

--- HTTP/2: because the Web was too easy (Cory Benfield)

https://t.co/Eq7WZ96wDI - slides

Semi-recent developments in web dev have included CSS spriting, optimising load time for 'above the fold' and aggressive minification + concatenation of resources. Many of these are essentially hacks to mitigate the misuse of TCP connections by HTTP. TCP connections are intended to stay open for minutes or at least seconds, not milliseconds. Not only do these hacks complicate the toolchain, they make caching more difficult.
HTTP/2 offers great latency reduction by using a single connection for multiple resources. It also allows servers to push resources you haven't asked for yet (EG CSS, JS) so they're in your cache once you've parsed the HTML.

Cory's hyper lib already gives requests HTTP/2 support (as a transport adapter), so from an OpenStack perspective, when we want to switch it should be fairly easy. However HTTP/2 is ~95% SPDY, which wraps HTTP/1.1 and was very much designed for browsers. Benefits to RESTful interactions are definite but incidental to the design. There is some talk of an HTTP/3 which tackles this but it sounds like wishful thinking at this point.

--- Ganga: an interface to the LHC computing grid (Matt Williams)

Map/reduce for the LHC. You submit jobs, either to a local backend or the LHC grid, and it does them for you. They have a remote filesystem so you don't have to download gigabyte datafiles locally. Pre-dates OpenStack.

--- Advanced py.test fixtures (Floris Bruynooghe)

http://pytest.org/latest/contents.html

Py.test looks interesting. I can't say for certain it's better than our current subunit/testr/mock/tox stack or even how many of those it replaces. It's hard to imagine it's any worse though. The fixtures offer:
* autowired dependency injection
* parameterisation - multiple tests are generated so you know which values failed
* finalization
* markers - general tagging mechanism, so you can restrict a test run to EG skip DB tests. You can reflect on markers from fixtures, to EG skip tests if a server is unavailable (and fail if the test runner is CI).

Unrelated to PyCon, Matt W found http://pythonhosted.org/behave/ for BDD. Matt and I have both done BDD previously and found it beneficial.

--- Keynote: Lessons from Strangers (Rachel Sanders)

http://www.slideee.com/slide/pycon-uk-2014-keynote

* "People are happiest when they can get their work done." This is an MBA-level truth bomb.
* People don't want to use your product, they want to HAVE used it.
* Nobody wants to read the damn manual.
Or can remember 850 pages. Build clean APIs. Do UX. Listen. Empower. Do RCA. Plan for mistakes.
* Read "The Design of Everyday Things", by Don Norman (on my Amazon wishlist now!)

--- When Performance matters (Marc-André Lemburg)

--- The High Performance Python Landscape (Ian Ozsvald)

https://github.com/egenix/when-performance-matters

Ian gave me (not just me) a free book! If anyone wants a browse, it's on my shelf.

Step #1, as any fule kno, is to profile. Some tools: cProfile, line-profiler, memory-profiler. Run Snake Run is a visualizer for cProfile.

There are some simple optimisations that speed up regular Python code. For example list comprehensions are faster than for loops; str.join is faster than string interpolation (duh); string dict lookups are slow, use ints or interned strings; exceptions are terrible if triggered (~625ms not ~40ms). Note this is all for high performance Python, not necessarily the Pythonic or readable way to do it.

The next step is using C extensions for your tight loops, EG looping in numpy/lxml; the operator module; cStringIO. You can save memory using __slots__ or the Flyweight pattern (Design Patterns: not just for Java!).

Beyond this... PyPy is a Python interpreter with a JIT, which makes it much faster than CPython (the regular interpreter). It supports cffi (so you can keep using C extensions like numpy) although the callout to these is slower than from CPython. Note PyPy is *slower* than CPython until the JIT kicks in.

Cython can compile your Python to C but you need type annotations, which make it no longer Python. You can autogenerate annotation comments with ShedSkin, which could be manually converted to Cython types. Or you could use Pythran, which uses the comments directly. However, the numba lib manages equivalent (iirc) speedups using decorators to mark functions for JIT. I don't see why anyone would use Cython or Pythran; JITs rule OK.

If your problem is vectorizable, you have a nuclear option in OpenCL.
You can use that on your CPU or GPU.

--- Simulating Quantum Systems in Python (Katie Barr)

http://physik.uni-paderborn.de/?id=178571 (not from KB)

I met Katie while getting some air. She was quite irate about HP cancelling their quantum research programme!

Her talk was on simulation of discrete-time quantum walks across a 2D array. Although the science part was a bit boggling, the Python part was pretty simple. The probability wave spreads across the array, attenuating as it extends but building interference patterns. One location, marked by the coin flip used there being unfair, builds up a high probability. The discrete-time quantum walk can find this location in roughly O(√n) iterations, and √n is provably the minimum for quantum search.

--- Use of OpenStack CI for your own projects (Yolanda Robla Mota)

Yolanda from the Gozer team gave a comprehensive if whirlwind tour of Gerrit, Zuul, Jenkins, JJB et al. Not many people turned up to be honest; I think it went way over most people's heads and/or needs.

--- The IPython Notebook is for everyone (Gautier Hayoun)

http://opentechschool.github.io/python-data-intro/core/notebook.html
https://cloud.sagemath.com/ - try it out
(not GH's links)

The IPython Notebook is a web UI to the IPython shell, so you don't want to make a server publicly accessible! However it's pretty fantastic for playing with "what if?" scenarios or creating tutorials. I can see great application in classrooms, where you put a prewritten notebook on each student's PC and let them tinker with it.

--- Stormy Webber (Wes Mason)

I missed most of this talk but essentially Wes presented Tornado, a web framework and async networking lib. Think node.js for Python.

--- Functional Programming and Python (Pete Graham)

Unfortunately Pete talked mostly about the advantages of functional programming, with little proof or demonstration. Functional programming did not feature heavily in conference discussion at all.
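As a rough illustration of what functional style looks like in Python (not taken from Pete's talk; every function name here is made up for this sketch), the usual ingredients are small pure functions combined with map, filter and functools.reduce, rather than loops that mutate state:

```python
from functools import reduce
from operator import add


def compose(*funcs):
    """Return a function that applies funcs right to left, as in maths notation."""
    def composed(value):
        for func in reversed(funcs):
            value = func(value)
        return value
    return composed


def square(x):
    return x * x


def is_even(x):
    return x % 2 == 0


def sum_of_even_squares(numbers):
    # No mutation: filter and map build lazy iterators,
    # and reduce folds them into a single value (0 for empty input).
    return reduce(add, map(square, filter(is_even, numbers)), 0)


# Hypothetical example of composition: add one first, then square.
increment_then_square = compose(square, lambda x: x + 1)

if __name__ == "__main__":
    print(sum_of_even_squares(range(10)))
    print(increment_then_square(3))
```

The same sum could be written as a generator expression, `sum(x * x for x in numbers if x % 2 == 0)`, which is usually considered the more Pythonic spelling of the same idea.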
--- Building great APIs in Python (Paul Hallett)

a) Use REST; b) keep it simple. I did learn about the PATCH verb, which is what you should use instead of PUT to send a partial resource, IE do a partial update. HATEOAS (Hypermedia As The Engine Of Application State) - horrible acronym for linking to resources by URI in REST APIs instead of by UUID, numeric ID or common name. Tastypie is a neat lib for making resources available, offering an automatic endpoint list. Httppy gives you a shell from which you can GET resources easily.

--- How does a spreadsheet work? (Harry Percival)

Harry talked us through Dirigible, which has been open-sourced. I think he said it was going to be a product but PythonAnywhere decided there wasn't a market. He agreed that in practice, you should probably just use IPython Notebook.

It's generally pretty simple: he started out just eval'ing formulae, then substituting in cell values. You build a tree to handle dependencies.

Engaging, fast-paced speaker; his site is obeythetestinggoat.com and he wrote O'Reilly's "Test-Driven Development with Python". Worth seeing just for entertainment.

--- Keynote: A time traveler's guide to Python (Jessica McKellar)

(I've tweeted her to ask for slides, not available yet afaict)

I thought this was going to be a fluff talk initially. Jessica spent 20 mins talking about Python's history, then she blew us all away. Turns out she's a startup founder, a PSF director (obviously an engineer) and would like the Python community to shut the hell up about 2 vs 3.

Instead she would like us to focus on growing market share through the traditional mechanisms of a startup. For example, a focus on the onboarding experience and seizing opportunities to push Python, EG in schools. I was slightly creeped out by zero-sum language but I trust it was just phrasing. It was refreshing to hear someone applying time-proven business lessons on growing adoption to a programming community.

Jessica mentioned mobile as something to talk about.
Initially she didn't offer an opinion but after being asked directly she said she'd like to see something.

--- Using Python to improve government (Michael Brunton-Spall)

https://speakerdeck.com/bruntonspall/using-python-to-improve-government-pycon-uk-2014

I first (and last) saw MBS at FPDays, talking about Scala at the Guardian, so I was interested in his shift. He spent a lot of time debugging the JVM GC apparently and had strong moral reasons for wanting to work in the public sector.

His talk was inspiring. He was hired by the Cabinet Office to help sort out govt IT. The DirectGov initiative has had some success in publishing but MBS was asked to help with transactions.

He first worked with the Insolvency Service. He took a .NET team and, with the help of a big Cabinet Office stick, took them right out of their comfort zone to Python. This stopped them retaining any bad practices without being too hard (Haskell). He taught Agile, storyboarding, composition, decomposition, recursion, UX and many other things.

Specifically they worked on the redundancy payments system. He showed an interaction diagram from hell; it seems the service user had to submit an initial 16-page form then have several follow-on interactions. Bearing in mind similar points to Rachel Sanders' talk, and that people claiming redundancy benefits really do just want to get paid, not fill out forms, they converted this into a wizard with recurring sections, usually only asking for 2-3 pieces of info per page.

Finally he brought the whole team to PyCon! They all seemed engaged and really proud of the process and results.

--- The Minecraft Challenge (Katie Bell)

Katie's challenge was to create a way to script Minecraft that preserved the nature of the game, was approachable and allowed multiple participants at once. She achieved this by creating a scripting interface to control a robot. She had a few demo scripts, EG to mine for iron or build a stone hut.
She also dropped her robot pumpkin on a chicken. Feathers and applause everywhere (the robot is represented in game by a pumpkin).

An important part of her message to children is that when you get bored of something, you can script it. The robot style of scripting supports this goal of playing the game, just more efficiently. The (more powerful) random access scripting style on the other hand makes it a different game.

--- Dr. Jython; or, How I Learned to Stop Worrying and Love the JVM (Naomi Ceder)

http://j.mp/1rlhaqu

Contrary to the title, Naomi did not seem to love the JVM. The talk was a bit agonising for a Java native; she complained about Ant (well yeah, Ant is a bad time in a box) but preferred Make and managing her classpath with shell scripts. She complained about how hard it was to pip install packages instead of simply embracing JVM libraries. Finally she compared bad Java code against good(ish) Python code, which is unfair and annoys me.

On the plus side, she achieved what she wanted to, which was talking to a SOAP service with buggy WSDL. This despite the latest released version of Jython being 2.5 (she used the 2.7 beta). She did mention there's no GIL, hence no need for b****y eventlet.

--- Keynote: Miss Adventures in Raspberry Pi (Carrie Anne Philbin)

Carrie talked about taking Raspberry Pis into schools to teach Python with. I spent a lot of the talk hoping she'd mention Barefoot; she did mention CAS. She talked about Scratch and how, although it was great, it was also important to graduate students to text languages.

--- Miscellanea

https://www.nanoporetech.com/ had a stand with several DNA sequencers, each ~$1000 and smaller than a phone. Seemed worth mentioning.

http://bazaar.launchpad.net/~ubuntuone-hackers/conn-check/trunk/view/head:/README.rst conn-check knows how to check that all sorts of services are available.

Pyfakefs is good for stubbing file IO.

http://python-namibia.org/ is a thing.
Unrelated to PyCon but completely badass: http://t.co/MysUpW7mU3

There was basically no discussion of static type systems whatsoever.

I was pleased to meet not one but two prototypical PyLadies. We'll see if a group can form. Unfortunately one just started a job and the other can't code yet.