My name is Alexis Lee. I like simplifying problems, Buddhism, co-op games and making terrible art.
Tuesday, September 30, 2014
PyCon UK 2014
Hello,
I'm never going to remember everything until next Monday, so have a
braindump. I'll put up an item for questions in that meeting. These are
the talks I went to; I've linked slides for some but not all.
Note PyPy is an alternative Python interpreter; PyPI is where pips come
from; PiPy is a Raspberry Pi lib; and if you name your next lib PiPi I
will hurt you.
--- Keynote: Ecosystem threats to Python (Van Lindberg)
Java, JS and Go. These are threats because Python has poor interop with
them. He did mention it's not a zero-sum game. There wasn't a rallying
cry or solution offered, Van just called these out.
There was a lot of instinctive dislike of Java. It got called out as
90s-ancient and enterprise-y. There didn't seem to be much understanding
of why people might choose the JVM as a platform. Scala got a brief
mention at least. Java is the native language for Android, I can't
remember if Van mentioned this but Python is shut out of mobile right
now.
JS was lumped with node.js. CoffeeScript, PureScript etc weren't
mentioned. I don't remember much beyond a few cracks really.
There was some admiration for Go, particularly the single-executable
deployment story. Speaking to Michael Foord (Go dev at Canonical)
afterwards, Go bucks the trend on exception handling and several libs
try to fix that. The performance is good though - Van has a fork of
Swift with pieces rewritten in Go and he says it's 3x faster.
--- HTTP/2: because the Web was too easy (Cory Benfield)
https://t.co/Eq7WZ96wDI - slides
Semi-recent developments in web dev have included CSS spriting,
optimising load time for 'above the fold' and aggressive minification +
concatenation of resources. Many of these are essentially hacks to
mitigate the misuse of TCP connections by HTTP. TCP connections are
intended to stay open for minutes or at least seconds, not milliseconds.
Not only do these hacks complicate the toolchain, they make caching more
difficult.
HTTP/2 offers great latency reduction by using a single connection for
multiple resources. It also allows servers to push resources you didn't
yet ask for (EG CSS, JS) so they're in your cache once you've parsed the
HTML.
The requests lib already supports HTTP/2, so from an OpenStack
perspective, when we want to switch it should be fairly easy. However
HTTP/2 is ~95% SPDY, which wraps HTTP/1.1 and was very much designed for
browsers. Benefits to RESTful interactions are definite but incidental
to the design. There is some talk of an HTTP/3 which tackles this but it
sounds like wishful thinking at this point.
--- Ganga: an interface to the LHC computing grid (Matt Williams)
Map/reduce for the LHC. You submit jobs, either to a local backend or
the LHC grid, and it does them for you. They have a remote filesystem so
you don't have to download gigabyte datafiles locally. Pre-dates OpenStack.
--- Advanced py.test fixtures (Floris Bruynooghe)
http://pytest.org/latest/contents.html
Py.test looks interesting. I can't say for certain it's better than our
current subunit/testr/mock/tox stack or even how many of those it
replaces. It's hard to imagine it's any worse though.
The fixtures offer:
* autowired dependency injection
* parameterisation - multiple tests are generated so you know which
values failed
* finalization
* markers - general tagging mechanism, you can restrict a test run to
EG not run DB tests. You can reflect on markers from fixtures, to EG
skip tests if a server is unavailable (and fail if the test runner is
CI).
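
As a rough illustration of the features listed above, here is a minimal sketch. It is not taken from the talk: the `db` and `user` fixtures and the `db` marker name are made up for the example.

```python
import pytest


@pytest.fixture
def db():
    # Hypothetical stand-in for a database connection.
    conn = {"users": []}
    yield conn          # the test body runs at this point
    conn.clear()        # finalization: runs after the test, pass or fail


@pytest.fixture
def user(db):
    # Fixtures can request other fixtures by argument name.
    db["users"].append("alice")
    return "alice"


def test_user_is_stored(db, user):
    # pytest matches the argument names to fixtures and injects them.
    assert user in db["users"]


@pytest.mark.parametrize("value, expected", [(1, 1), (2, 4), (3, 9)])
def test_square(value, expected):
    # Generates three separate tests, each reported individually.
    assert value ** 2 == expected


@pytest.mark.db  # custom marker; deselect with: pytest -m "not db"
def test_starts_empty(db):
    assert db["users"] == []
```

Saved as a `test_*.py` file this should run under plain `pytest`. Recent pytest versions warn about custom markers such as `db` unless they are registered in the config file.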
Unrelated to PyCon, Matt W found http://pythonhosted.org/behave/ for
BDD. Matt + I have both done BDD previously and found it beneficial.
--- Keynote: Lessons from Strangers (Rachel Sanders)
http://www.slideee.com/slide/pycon-uk-2014-keynote
* "People are happiest when they can get their work done."
This is an MBA-level truth bomb.
* People don't want to use your product, they want to HAVE used it.
* Nobody wants to read the damn manual. Or can remember 850 pages.
Build clean APIs. Do UX. Listen. Empower. Do RCA. Plan for mistakes.
* Read "The Design of Everyday Things", by Don Norman (on my Amazon
wishlist now!)
--- When Performance matters (Marc-André Lemburg)
--- The High Performance Python Landscape (Ian Ozsvald)
https://github.com/egenix/when-performance-matters
Ian gave me (not just me) a free book! If anyone wants a browse, it's on
my shelf.
Step #1, as any fule kno, is to profile. Some tools: cProfile,
line-profiler, memory-profiler. Run Snake Run is a visualizer for
cProfile.
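
For reference, a minimal sketch of profiling with cProfile from the standard library. The `slow_sum` function is just a made-up workload, not something from either talk.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Deliberately naive loop so there is something to measure.
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report into a string, sorted by cumulative time,
# limited to the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

For a whole script, `python -m cProfile -o out.prof script.py` writes the same data to a file that a visualizer can load.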
There are some simple optimisations that speed up regular Python code.
For example list comprehensions are faster than for loops; str.join is
faster than string interpolation (duh); string dict lookups are slow,
use ints or interned strings; exceptions are terrible if triggered
(~625ms not ~40ms). Note this is all for high performance Python, not
necessarily the Pythonic or readable way to do it.
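
If you want to check these claims on your own machine, a small timeit harness along these lines should do. This is a sketch with made-up function names; the numbers depend heavily on interpreter version and hardware, so no timings are quoted here.

```python
import timeit


def squares_loop(n):
    out = []
    for i in range(n):
        out.append(i * i)
    return out


def squares_comprehension(n):
    return [i * i for i in range(n)]


def concat_interpolate(words):
    # Builds a new string on every iteration.
    result = ""
    for word in words:
        result = "%s%s" % (result, word)
    return result


def concat_join(words):
    return "".join(words)


def lookup_try(mapping, key):
    # Pays for an exception every time the key is missing.
    try:
        return mapping[key]
    except KeyError:
        return None


def lookup_check(mapping, key):
    return mapping[key] if key in mapping else None


words = ["spam"] * 1000
cases = [
    (squares_loop, (1000,)),
    (squares_comprehension, (1000,)),
    (concat_interpolate, (words,)),
    (concat_join, (words,)),
    (lookup_try, ({}, "missing")),
    (lookup_check, ({}, "missing")),
]
for func, args in cases:
    seconds = timeit.timeit(lambda: func(*args), number=200)
    print("%-22s %.4fs" % (func.__name__, seconds))
```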
The next step is using C extensions for your tight loops, EG looping in
numpy/lxml; the operator module; cStringIO. You can save memory using
slots or the Flyweight pattern (Design Patterns: not just for Java!).
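
A minimal sketch of the slots trick, using two made-up point classes. Declaring `__slots__` stops Python creating a per-instance `__dict__`, which is where the memory saving comes from; exact sizes vary by Python version so they are only printed, not promised.

```python
import sys


class PointDict:
    # Regular class: every instance carries its own __dict__.
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    # Fixed attribute set: no per-instance __dict__ is allocated.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PointDict(1, 2)
q = PointSlots(1, 2)

print("dict-based:", sys.getsizeof(p) + sys.getsizeof(p.__dict__), "bytes")
print("slots-based:", sys.getsizeof(q), "bytes")
```

The trade-off is that you can no longer attach arbitrary new attributes to an instance.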
Beyond this... PyPy is a Python interpreter with a JIT, which makes it
much faster than CPython (the regular interpreter). It supports cffi (so
you can keep using C extensions like numpy) although the callout to
these is slower than from CPython. Note PyPy is *slower* than CPython
until the JIT kicks in.
Cython can compile your Python to C but you need type annotations which
make it no longer Python. You can autogenerate annotation comments with
ShedSkin, which could be manually converted to Cython types. Or you
could use Pythran which uses the comments directly. However, the numba
lib manages equivalent (iirc) speedups using decorators to mark
functions for JIT. I don't see why anyone would use Cython or Pythran,
JITs rule OK.
If your problem is vectorizable, you have a nuclear option in OpenCL.
You can use that on your CPU or GPU.
--- Simulating Quantum Systems in Python (Katie Barr)
http://physik.uni-paderborn.de/?id=178571 (not from KB)
I met Katie while getting some air. She was quite irate about HP
cancelling their quantum research programme! Her talk was on simulation
of discrete-time quantum walks across a 2D array. Although the science
part was a bit boggling, the Python part was pretty simple.
The probability wave spreads across the array, attenuating as it extends
but building interference patterns. One location, marked by the coin
flip used there being unfair, builds up a high probability. The
discrete-time quantum walk can find this location in roughly O(sqrt(n))
iterations (up to log factors), a quadratic speedup over classical search.
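
The Python part really can be small. Here is a minimal sketch of a discrete-time quantum walk, simplified to a 1D line with the same Hadamard coin everywhere. There is no marked location, so it shows the spreading and interference but not the search. It is an illustration under those assumptions, not Katie's code.

```python
import math


def hadamard_walk(steps):
    """Return the position probabilities after `steps` of a 1D quantum walk."""
    size = 2 * steps + 1  # positions -steps..steps, origin at index `steps`
    # amplitudes[pos] = [coin state "left", coin state "right"]
    amplitudes = [[0j, 0j] for _ in range(size)]
    # Balanced starting coin state at the origin.
    amplitudes[steps] = [1 / math.sqrt(2), 1j / math.sqrt(2)]
    h = 1 / math.sqrt(2)

    for _ in range(steps):
        new = [[0j, 0j] for _ in range(size)]
        for pos, (left, right) in enumerate(amplitudes):
            if left == 0 and right == 0:
                continue  # nothing has reached this position yet
            # Coin flip: apply the Hadamard matrix to the coin state.
            flipped_left = h * (left + right)
            flipped_right = h * (left - right)
            # Shift: the "left" part moves left, the "right" part moves right.
            new[pos - 1][0] += flipped_left
            new[pos + 1][1] += flipped_right
        amplitudes = new

    return [abs(left) ** 2 + abs(right) ** 2 for left, right in amplitudes]


probabilities = hadamard_walk(20)
for offset, p in enumerate(probabilities):
    print("%+3d %s" % (offset - 20, "#" * int(p * 200)))
```

Unlike a classical random walk, the printed distribution is not a bell curve around the origin; most of the probability ends up in two peaks away from the centre.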
--- Use of OpenStack CI for your own projects (Yolanda Robla Mota)
Yolanda from the Gozer team gave a comprehensive if whirlwind tour of
Gerrit, Zuul, Jenkins, JJB et al. Not many people turned up to be
honest, I think it went way over most people's heads and/or needs.
--- The IPython Notebook is for everyone (Gautier Hayoun)
http://opentechschool.github.io/python-data-intro/core/notebook.html
https://cloud.sagemath.com/ - try it out
(not GH's links)
The IPython Notebook is a web UI to the IPython shell. This means you
don't want to make a server publicly accessible! However it's pretty
fantastic for playing with "what if?" scenarios or creating tutorials. I
can see great application in classrooms, where you put a prewritten
notebook on each student's PC and let them tinker with it.
--- Stormy Webber (Wes Mason)
I missed most of this talk but essentially Wes presented Tornado, a web
framework and async networking lib. Think node.js for Python.
--- Functional Programming and Python (Pete Graham)
Unfortunately Pete talked mostly about the advantages of functional
programming, with little proof or demonstration. Functional programming
did not feature heavily in conference discussion at all.
--- Building great APIs in Python (Paul Hallett)
a) Use REST; b) keep it simple. I did learn about the PATCH verb, which
is what you should use to PUT a partial resource IE do a partial update.
HATEOAS - horrible acronym for using URIs to identify resources in REST
APIs instead of UUID, numeric ID or common name.
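
To make the PUT/PATCH distinction concrete, here is a toy sketch using a plain dict as the resource store. There is no real HTTP involved, and the `/servers/42` and `/users/1` URIs are hypothetical. The `owner` field holds a URI rather than a bare ID, in the spirit of the HATEOAS point above.

```python
import copy


def put(store, uri, body):
    # PUT replaces the whole resource with the supplied representation.
    store[uri] = copy.deepcopy(body)
    return store[uri]


def patch(store, uri, changes):
    # PATCH applies a partial update; fields not mentioned are kept.
    store[uri].update(copy.deepcopy(changes))
    return store[uri]


store = {
    "/servers/42": {
        "name": "web01",
        "flavor": "small",
        "owner": "/users/1",  # link to another resource, not a numeric ID
    }
}

patched = copy.deepcopy(patch(store, "/servers/42", {"flavor": "large"}))
print("after PATCH:", patched)

replaced = put(store, "/servers/42", {"name": "web01"})
print("after PUT:  ", replaced)
```

After the PATCH the server keeps its name and owner; after the PUT only the fields in the new body survive.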
Tastypie is a neat lib for making resources available, offering an
automatic endpoint list.
Httppy gives you a shell from which you can GET resources easily.
--- How does a spreadsheet work? (Harry Percival)
Harry talked us through Dirigible, which has been open-sourced. I think
he said it was going to be a product but PythonAnywhere decided there
wasn't a market. He agreed that in practice, you should probably just
use IPython Notebook.
It's generally pretty simple: he started out just eval'ing formulae, then
making cell value substitutions. You build a tree to handle
dependencies. Harry is an engaging, fast-paced speaker; his site is
obeythetestinggoat.com and he wrote O'Reilly's "TDD with Python". Worth
seeing just for entertainment.
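
The eval-and-substitute idea fits in a few lines. This is a toy sketch and not Dirigible's actual code: the `evaluate` function, the `=` formula prefix and the treatment of empty cells as 0 are all assumptions for the example. It walks the dependency tree recursively instead of building it up front.

```python
import re

# Matches cell references such as A1 or BC12.
CELL_REF = re.compile(r"\b[A-Z]+[0-9]+\b")


def evaluate(sheet, name, active=frozenset()):
    """Return the value of cell `name`, evaluating formulae recursively."""
    if name in active:
        raise ValueError("circular reference at " + name)
    raw = sheet.get(name, 0)  # assumption: empty cells count as 0
    if not (isinstance(raw, str) and raw.startswith("=")):
        return raw  # plain value, nothing to evaluate

    def substitute(match):
        # Replace each cell reference with that cell's computed value.
        value = evaluate(sheet, match.group(0), active | {name})
        return "(%r)" % (value,)

    expression = CELL_REF.sub(substitute, raw[1:])
    # Builtins are stripped, but eval is still unsafe on untrusted input.
    return eval(expression, {"__builtins__": {}})


sheet = {"A1": 2, "A2": 3, "A3": "=A1+A2", "A4": "=A3*2"}
total = evaluate(sheet, "A4")
print("A4 =", total)
```

A real spreadsheet would cache results and only recalculate cells downstream of an edit, which is where an explicit dependency graph earns its keep.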
--- Keynote: A time traveler's guide to Python (Jessica McKellar)
(I've tweeted her to ask for slides, not available yet afaict)
I thought this was going to be a fluff talk initially. Jessica spent
20mins talking about Python's history then she blew us all away. Turns
out she's a startup founder, a PSF director (obviously an engineer) and
would like the Python community to shut the hell up about 2 vs 3.
Instead she would like us to focus on growing market share through the
traditional mechanisms of a startup. For example, a focus on the
onboarding experience and seizing opportunities to push Python EG in
schools.
I was slightly creeped out by zero-sum language but I trust it was just
phrasing. It was refreshing to hear someone applying time-proven
business lessons on growing adoption to a programming community.
Jessica mentioned mobile as something to talk about. Initially she
didn't offer an opinion but after being asked directly she said she'd
like to see something.
--- Using Python to improve government (Michael Brunton-Spall)
https://speakerdeck.com/bruntonspall/using-python-to-improve-government-pycon-uk-2014
I first (and last) saw MBS at FPDays, talking about Scala at the
Guardian, so I was interested in his shift. He spent a lot of time
debugging the JVM GC apparently and had strong moral reasons for wanting
to work in the public sector.
His talk was inspiring, he was hired by the Cabinet Office to help sort
out govt IT. The DirectGov initiative has had some success in publishing
but MBS was asked to help with transactions. He first worked with the
Insolvency Service. He took a .NET team and, with the help of a big
Cabinet Office stick, took them right out of their comfort zone to
Python. This stopped them retaining any bad practices without being too
hard (Haskell). He taught Agile, storyboarding, composition,
decomposition, recursion, UX and many other things.
Specifically they worked on the redundancy payments system. He showed an
interaction diagram from hell, it seems the service user had to submit
an initial 16-page form then have several follow-on interactions.
Bearing in mind similar points to Rachel Sanders' talk, and that people
claiming redundancy benefits really do just want to get paid not fill
out forms, they converted this into a wizard with recurring sections,
usually only asking for 2-3 pieces of info per page.
Finally he brought the whole team to PyCon! They all seemed engaged and
really proud of the process and results.
--- The Minecraft Challenge (Katie Bell)
Katie's challenge was to create a way to script Minecraft that preserved
the nature of the game, that was approachable and allowed multiple
participants at once. She achieved this by creating a scripting
interface to control a robot. She had a few demo scripts, EG to mine for
iron or build a stone hut. She also dropped her robot pumpkin on a
chicken. Feathers and applause everywhere (the robot is represented in
game by a pumpkin).
An important part of her message to children is that when you get bored
of something, you can script it. The robot style of scripting supports
this goal of playing the game, just more efficiently. The (more
powerful) random access scripting style on the other hand makes it a
different game.
--- Dr. Jython; or, How I Learned to Stop Worrying and Love the JVM
(Naomi Ceder)
http://j.mp/1rlhaqu
Contrary to the title, Naomi did not seem to love the JVM. The talk was
a bit agonising for a Java native; she complained about Ant (well yeah,
Ant is a bad time in a box) but preferred Make and managing her
classpath with shell scripts. She complained about how hard it was to
pip install packages instead of simply embracing JVM libraries. Finally
she compared bad Java code against good(ish) Python code, which is
unfair and annoys me.
On the plus side, she achieved what she wanted to, which was talking to
a SOAP service with buggy WSDL. This despite the latest released version
of Jython being 2.5 (she used the beta 2.7). She did mention there's no
GIL, hence no need for b****y eventlet.
--- Keynote: Miss Adventures in Raspberry Pi (Carrie Anne Philbin)
Carrie talked about taking Raspberry Pis into schools to teach Python
with. I spent a lot of the talk hoping she'd mention Barefoot; she did
mention CAS (Computing At School). She talked about Scratch and how,
although it was great, it was also important to graduate students to
text languages.
--- Miscellanea
https://www.nanoporetech.com/ had a stand with several DNA sequencers,
each ~$1000 and smaller than a phone. Seemed worth mentioning.
http://bazaar.launchpad.net/~ubuntuone-hackers/conn-check/trunk/view/head:/README.rst
conn-check knows how to check that all sorts of services are available.
Pyfakefs is good for stubbing file IO.
http://python-namibia.org/ is a thing.
Unrelated to PyCon but completely badass: http://t.co/MysUpW7mU3
There was basically no discussion of static type systems whatsoever.
I was pleased to meet not one but two prototypical PyLadies. We'll see
if a group can form. Unfortunately one just started a job and the other
can't code yet.