TLS and SSL are two critical technologies that underlie much of the secure communication that occurs on the internet. Over the past few years, spurred by increasingly effective attacks and a desire for new functionality, SSL and TLS have seen many new features, as well as practical improvements.
Python is currently in a transitional period between Python 2 and Python 3. For
the past few years, all new feature development has been happening on Python 3,
including new features in Python's
ssl module. This means that Python 3 users
have had access to these improvements to TLS, but Python 2 users (still the
majority of Python users) have been falling behind.
Unfortunately, missing features in an SSL/TLS stack aren't like missing features in other modules. They can prevent interoperability with the wider internet and compromise the security of connections, and there's usually no workaround. To get a sense of how bad this situation is, you can watch Hynek Schlawack's talk on "The Sorry State of SSL" from PyCon.
This situation is unacceptable, given that Python 2 is going to continue to see production usage for many years to come. As a result, we advocated strongly for PEP 466, which provides permission to backport many of these improvements to Python 2.
A few of the important features missing from Python 2 are:
ALPN support. NPN (Next Protocol Negotiation) is a TLS extension originally introduced by Google that allows a connection to negotiate which application-level protocol will be used. It is currently in the process of being supplanted by ALPN (Application-Layer Protocol Negotiation), which is an IETF draft. These extensions are necessary for protocols such as SPDY and HTTP/2. (ALPN is not yet in Python because it is not yet in an OpenSSL release; once it is, the feature will be added.)
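To make the negotiation concrete, here is a minimal sketch of how a client offers protocols through ALPN. It assumes a later Python 3 interpreter (3.5 or newer) where `SSLContext.set_alpn_protocols` and `ssl.HAS_ALPN` exist; the helper name `make_client_context` is hypothetical and is not part of the backport.

```python
import ssl


def make_client_context(protocols):
    """Build a client-side context that offers `protocols` during the handshake.

    `make_client_context` is a hypothetical helper used only for illustration.
    """
    context = ssl.create_default_context()
    # HAS_ALPN is False when the linked OpenSSL was built without ALPN support,
    # in which case we simply skip the extension.
    if ssl.HAS_ALPN:
        context.set_alpn_protocols(protocols)
    return context


# Offer HTTP/2 first, then fall back to HTTP/1.1.
context = make_client_context(["h2", "http/1.1"])
```

After a handshake completes, calling `selected_alpn_protocol()` on the wrapped socket reports which protocol the two sides agreed on, or `None` if the peer did not take part in the negotiation.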
To address this, we backported each of these features from the Python 3 branch to the Python 2 branch. This was a large task, because it is a huge amount of code (the resulting diff is over 12,000 lines), written for two different programming languages, with some very fundamental differences.
Our first step was to take a diff of the files that comprise the SSL module between Python 2 and Python 3. Then we applied this diff using a graphical merge tool, throwing away changes that were incorrect or that introduced backwards incompatibilities.
Then we made sure all of the code compiled, fixing syntax errors and API differences between Python 2 and Python 3. Finally, we ran the tests and made fixes as necessary until all of them passed.
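As a rough illustration of the first step only, the sketch below produces a unified diff between two versions of the same function using the standard library's `difflib`. The two source snippets and the file names are made up for the example, and this is not the tooling used for the actual patch, which relied on a graphical merge tool.

```python
import difflib

# Hypothetical stand-ins for the same function on each branch.
py2_source = """\
def wrap(sock):
    return SSLSocket(sock)
""".splitlines(keepends=True)

py3_source = """\
def wrap(sock, server_hostname=None):
    return SSLSocket(sock, server_hostname=server_hostname)
""".splitlines(keepends=True)

# Each hunk in the result is a candidate change to accept, adapt, or discard.
diff = list(
    difflib.unified_diff(
        py2_source,
        py3_source,
        fromfile="py2/ssl.py",  # illustrative paths, not the real tree layout
        tofile="py3/ssl.py",
    )
)
print("".join(diff))
```

Reviewing hunk by hunk is what makes it possible to drop a change that would break existing Python 2 callers while still keeping the rest.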
A few of the issues we ran across were:
enum, which didn't exist in Python 2.
unicode. Particularly, existing Python 3 code was very strict about text vs. bytes in a way that often wasn't possible to emulate in Python 2.
The ssl module is deeply coupled to the socket module, and many of the internals of the socket module were changed in Python 3 to fully participate in PEP 3116.
Much of the ssl module is written in C, rather than in Python. C code is considerably more difficult to work with than Python code: while all of the Python-code differences were easy to work around, the C API differences required considerably more work.
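The text versus bytes issue in the list above is the kind of difference that usually ends up behind a small compatibility helper. The sketch below is one common way to do that; the names `text_type` and `to_bytes` are hypothetical and are not taken from the patch.

```python
import sys

# Python 2 calls its text type `unicode`; Python 3 calls it `str`.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (name only exists on Python 2)


def to_bytes(value, encoding="ascii"):
    """Return `value` as bytes, encoding it only if it is text.

    Hypothetical helper: on Python 2, `bytes` is an alias for `str`,
    so byte strings pass through unchanged on both versions.
    """
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, bytes):
        return value
    raise TypeError("expected text or bytes, got %r" % type(value))
```

For example, `to_bytes(u"example.org")` and `to_bytes(b"example.org")` both give `b"example.org"`, so code further down only ever has to deal with one type.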
This work is important to Rackspace for several reasons:
The patch with our work is currently in code review, and we're hoping it will be merged soon, for release in Python 2.7.9. One of our goals in working on this patch was to reduce the maintenance burden going forward, by minimizing the delta with Python 3. When new features like ALPN are added to Python 3, they should be much easier to backport to Python 2 than this was.