Modern Dictionaries by Raymond Hettinger

By: SF Python

590   9   39293

Uploaded on 12/16/2016

Abstract
Python's dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6. This fun talk is given by Raymond Hettinger, the Python core developer responsible for the set implementation and who designed the compact-and-ordered dict implemented in CPython for Python 3.6 and in PyPy for Python 2.7. He will use pictures and little bits of pure python code to explain all of the key ideas and how they evolved over time. He will also include newer features such as key-sharing, compaction, and versioning. This talk is important because it is the only public discussion of the state of the art as of Python 3.6. Even experienced Python users are unlikely to know the most recent innovations.

Who and Why (Audience):
This talk is for all Python programmers. It is designed to be fully understandable for a beginner (it starts from first principles) but to have new information even for Python experts (how key-sharing works, how the compact-ordered patch works, how dict versioning works). At the end of this talk, you can confidently say that you know how modern Python dictionaries work and what it means for your code.

Bio
Raymond Hettinger has also served as a director of the Python Software Foundation, and has mentored many people over the years on their contributions to the python-dev community. He is also well known for his contributions to the Python Cookbook, and shares many pieces of Python wisdom on Twitter. He is a frequent keynote speaker at Python Conferences around the world and has received the Distinguished Service Award at PyCon 2014 for his exceptional contributions to the python community.

Other info:
This talk is delivered at SF Python's 2nd Annual Holiday Party for Python Devs in SF Bay Area, CA. In you are in San Francisco area looking to meet other python devs, please check our schedule for meetups on http://sfpython.org

Comments (8):

By Dowwie    2017-09-20

Raymond Hettinger made an interesting talk about Python's dict (hash table) and how Python's core development team evolved it over time. He's giving the polished version of talk at PyCon this May but if you can't wait for it, here's a rough version: https://www.youtube.com/watch?v=p33CVV29OG8

Original Thread

By StreakyCobra    2018-02-07

This has been changed in Python 3.6 only, due to a change in dictionary implementation to make them more efficient. "Modern Dictionaries by Raymond Hettinger" [1] is a quite interesting and technical talk about these changes, explaining also the change in iteration order IIRC, worth a watch in my opinion!

[1] https://www.youtube.com/watch?v=p33CVV29OG8

Original Thread

By svieira    2018-04-17

Python dictionaries: A Confluence of Great Ideas (also known as Modern Dictionaries) by Raymond Hettinger

https://www.youtube.com/watch?v=p33CVV29OG8

Original Thread

By anonymous    2017-09-20

Let me address some issues you've raised in your question.

Here's the thing, right now the search operation is linear time and I'd like it to become constant time and I'm wondering about a simple/lightweight way to achieve this.

A simple lightweight way to achieve this, i.e., to have an associative array (a.k.a. key-value-store), is to use one provided by the standard library.

You are coding in a recent version of C++, you are in the lucky position that the standard library actually provides one that satisfies your constant-time requirements:

The implementation of the data structures shipped as part of a standard library of any decent compiler these days, are probably better than anything you could come up with. (Or why did you ask for give me the code?).

I've seen hashtables are a possible option but I'd prefer not having to implement a hash function per object. Maybe something similar to an unordered_map... dunno.

A std::unordered_map actually is a hash table, and as you can see in the docs, it takes a hash function. As you can see written in the docs there are lots of specializations for lots of types already available, that can help you derive a custom hash function for your custom object types:

Could anyone give some ideas or maybe providing a simple lightweight implementation of what I'm trying to achieve here?

Just have a look at the example code to std::unordered_map to see how it's used. If you worry about performance, don't forget to measure. If you really want to consume some input on implementation of hash tables, I liked these talks on the Python dictionary:

Also have a look at the wikipedia page (if you haven't already):

In this fictional example I'm using std::vector to avoid making the question bigger and more complex than what it is but my the real use-case won't be using the STL at all (ie: i'll be coding my own custom implementation of std::vector)

Unless you are doing this for educational/recreational purposes, don't do it. Don't be ashamed to base your endeavours on the shoulders of giants. That the standard library wasn't invented in your project is not a problem.

Original Thread

By anonymous    2017-09-20

It's a new era and with Python 3.6.1 dictionaries now retain their order. These semantics aren't explicit because that would require BDFL approval. But Raymond Hettinger is the next best thing (and funnier) and he makes a pretty strong case that dictionaries will be ordered for a very long time.

So now it's easy to create slices of a dictionary:

test_dict = {
                'first':  1,
                'second': 2,
                'third':  3,
                'fourth': 4
            }

list(test_dict.items())[:2]

Original Thread

By anonymous    2018-02-12

That's because you're using sequential keys. Take a look at [Raymond Hettinger's excellent talk](https://www.youtube.com/watch?v=p33CVV29OG8).

Original Thread

Popular Videos 543

Submit Your Video

If you have some great dev videos to share, please fill out this form.