Tags » python

Meta Tech Podcast #57 - Writing and Linting Python at Scale

I joined Pascal Hartig on the Meta Tech Podcast, episode #57, to talk about Fixit, the new open source linter I’ve been working on, how and where Python is used at Meta, and how the Python Language Foundation team supports thousands of engineers, data scientists, AI and ML researchers, and anyone else who uses Python to get their work done.

You can find the episode in your podcast app of choice, or listen right here:

Episode photo of Pascal and me laughing

Fixit 2: Meta’s Next-Generation Auto-Fixing Linter

Let’s talk about my newest open source project, built as part of my effort to improve the linting ecosystem at Meta:

This year, we have been building a new linter, Fixit 2, designed from the ground up to make developers more efficient and capable, both in open source projects and the diverse landscape of our internal monorepo. At Meta, we are using Fixit 2 with a few early adopters, and plan to roll it out to the rest of our monorepo soon. But any developer can use it to perform auto-fixing more efficiently and make faster improvements to their own codebases.

Something I don’t get to talk about enough is just how much I truly appreciate the ability to spend my time at Meta solving problems with open source, and that I can share my work with the rest of the world. Fixit is the third open source project that I’ve had the privilege of releasing at Meta, and I believe Fixit makes it easier than ever before to build new lint rules for Python, a powerful tool for any Python team.

A new lint rule can be written in fewer than a dozen lines of code, with test cases defined inline, and you can even place the rule right next to the code it will be linting:

# teambread/rules/hollywood.py
import fixit
import libcst

class HollywoodName(fixit.LintRule):
    VALID = [...] # no lint errors here
    INVALID = [...] # bad code samples here
      
    def visit_SimpleString(self, node: libcst.SimpleString):
        if node.value in ('"Paul"', "'Paul'"):
            self.report(node, "It's underbaked!")
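
As one possible sketch (not taken verbatim from the Fixit docs), the inline test cases could be filled in with short code samples like this; check the Fixit documentation for the exact test case API:

# teambread/rules/hollywood.py (hypothetical sketch with test cases filled in)
import fixit
import libcst

class HollywoodName(fixit.LintRule):
    # code samples that should produce no report
    VALID = [
        'name = "Susan"',
    ]
    # code samples that should trigger the rule
    INVALID = [
        'name = "Paul"',
    ]

    def visit_SimpleString(self, node: libcst.SimpleString):
        # SimpleString.value includes the surrounding quote characters
        if node.value in ('"Paul"', "'Paul'"):
            self.report(node, "It's underbaked!")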

Unlike other new linters on the market, Fixit is uniquely focused on reducing the barriers to writing high quality, auto-fixing lint rules in pure Python, without needing to learn Rust or any other language. We believe that having targeted lint rules with auto-fixes is key to improving developer productivity even more than having the fastest possible linter:

When running Fixit 2 with auto-fixing lint rules, any code that triggers the lint rule is an opportunity to get an automatic replacement, improving the codebase with less effort from the developer. Applied more broadly, Fixit 2 can even be used as a tool to enact sweeping codemods against a large codebase while leaving a lint rule in place to handle any matching code in the future.
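
As a rough sketch of what that can look like in a rule (assuming report() accepts a replacement node; check the Fixit docs for the exact signature), the HollywoodName rule above could offer a concrete fix alongside the report:

# teambread/rules/hollywood.py (hypothetical auto-fixing variant)
import fixit
import libcst

class HollywoodNameFixer(fixit.LintRule):
    VALID = [...]  # no lint errors here
    INVALID = [...]  # bad code samples here

    def visit_SimpleString(self, node: libcst.SimpleString):
        if node.value in ('"Paul"', "'Paul'"):
            # Build a replacement node with libcst; Fixit can apply it
            # automatically when running with auto-fixes enabled.
            new_node = node.with_changes(value='"Mary"')
            self.report(node, "It's underbaked!", replacement=new_node)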

Fixit 2 is already available on PyPI. You can pip install fixit and start using it with zero configuration.

We have a roadmap with plans for future improvements and features, and a rich set of documentation and user guides to help you get started with Fixit 2 in your own projects or repositories.

Talk @ PyCon US 2022: Open Source on Easy Mode

My PyCon talk “Open Source on Easy Mode” is available to watch on YouTube, but I made a bold decision this year to give my talk using an iPad.

It turned out to actually be “PyCon Talk on Hard Mode” instead, because nothing went right.

It’s March, and I’m writing my talk. I care enough about style, typography, and legibility that I need tight control over fonts and themes. I lean heavily on Keynote and Adobe Fonts, with screenshots of oversized 42pt type in my editor of choice to get the best results.

And because I’m never good at extemporaneous speaking, I write a full script ahead of time, and keep that script in the speaker notes, so that in the worst case, I can fall back to orator mode. This all means I need tight control over the device that I’m presenting from.

But the only laptop I have is my employer’s, where installing fonts is complicated and a grey area that would require mixing my personal Adobe account with professional work tools. Every other device I own is a desktop, tablet, or phone.

On top of that, I’ve never liked having to bring a big bag and a heavy laptop just to give a presentation. Carrying four pounds of gear around all day sucks, and I’d much prefer something small and light enough to fit in my everyday bag.

That’s when I get The Idea:

“Why not give the talk using my iPad Mini?”

Modern iOS includes Keynote, and I can use the Adobe app to install my fonts system-wide, just like on my Mac; Keynote can use them perfectly fine.

I pull out my USB-C HDMI dongle, and while the iPad normally just mirrors its screen over HDMI, Keynote shows the slides full screen on the external display while keeping the presenter notes on the iPad itself, just like a laptop. It even renders 16:9 when presenting!

I double-check my monitor’s input stats, and the iPad is outputting a 1440p signal. This might actually work!

It can’t be this easy, right? Am I crazy?

Seriously considering giving my PyCon talk using just an iPad and leaving my laptop at home. 👀

— Amethyst Reese ✊🏳️‍🌈🏳️‍⚧️ (@n7cmdr) March 29, 2022

I tried it with other monitors and my TV, and it seemed flawless. I did full dry runs using my iPad, and even paired my trackpad and keyboard to the iPad and edited the rest of the talk on the iPad. I was convinced I wouldn’t need my laptop.

A week before the event, I committed.

We fly to SLC, and I’m continuing to practice—and edit—my talk on the iPad. The morning of the talk, I immediately head to the green room and test my setup.

No issues.

This is actually happening!

I come back after lunch. A volunteer walks me to the room I’m presenting in.

I say hi to the staff, drop off my bag, get mic’ed up, walk up to the podium, plug in HDMI, trigger Keynote, and …

The A/V system is only rendering my slides on the left half of the screen, squashed into the wrong aspect ratio.

It’s OK, we’ll just unplug and re-plug.

But no matter what I or the A/V staff do, my slides won’t show up correctly.

We delay the start of the talk; people are still filing in bit by bit. The host tries to lighten the mood and asks me a couple filler questions to stall for time.

Time flies while we’re desperately waiting to see if this can get fixed. Before I know it, it’s about 8 minutes past the original start time, and the host asks if I just want to get started anyways.

I say yes, because bad slides are at least better than no slides—or no talk.

Now, my topic was intentionally broad in scope, and I had a lot to cover. I had already planned to talk fast to get through everything. During rehearsals, I cut content so that I could consistently finish in almost exactly the allotted 30 minutes.

But now I’m already behind.

I had Keynote set up to show me a timer, so I would know how I was doing on time. Ten minutes into the talk, and the host is already showing me the “15 minutes left” card. I’m talking as fast as I can, but also stumbling under the pressure. I’ve gone from speaking to reading.

At 13 minutes, an A/V expert manages to fix the output of my slides, but it’s too late for the demo of “thx” that I was most excited to share. I wouldn’t find out the slides were fixed until much later. Nothing mattered if I couldn’t finish the talk.

Moments later, they show me the “10 minutes left” card. They’re not giving me the full half hour. I start dropping phrases and sentences wherever I can to make up time. I don’t think I’ve ever talked this fast in my life.

At 18 minutes, they give me the “5 minute warning”.

I make it to the final section (well ahead of my original expected pace, mind you) and get the STOP card. I power through 8 slides in 40 seconds, but the “ums” return as I have to make it all up on the spot.

I close out the talk with a total runtime of just about 26 minutes.

The polite applause starts, and I manage to get myself off stage with a shred of dignity intact. As they’re removing the mic, I thank and apologize to the A/V crew for causing so many problems. 😅

I find out that—for reasons I will never understand—the iPad was only negotiating a resolution of 1080i (interlacing!) instead of the 1080p that their system was expecting.

Why 1080i? Because technology!

And why did that break the A/V layout? No idea.

But thankfully, even after my mediocre delivery, the smashed slides, and flickering output, there were still folks that stopped by to say hello, compliment my talk, and ask questions. Sadly, my brain was toast, so I have little memory of who they were or what we talked about. 😓

I don’t know if I’ll have the opportunity to give this talk again in the future, but I’m planning to re-record it at home soon. Maybe I can talk at a more reasonable pace, and actually make it through with all of my content in one piece.

But there’s one thing I know I’m doing the next time a talk gets selected for PyCon, or any of the wonderful regional conferences I love attending:

I’m using my iPad to present! 💪

Talk Python #304 - asyncio all the things with Omnilib

I joined Michael Kennedy on the excellent Talk Python podcast, episode #304, last week to chat about the Omnilib Project, an organization of open source packages I started, and how they fit into the modern world of Python and AsyncIO. We also discuss how I got started in programming and Python, with a rare glimpse into just how nerdy I was as a child. 😅

Listen in Overcast, any podcast player, or right here:

Talk @ North Bay Python 2019: What is a Coroutine Anyway?

Back in 2019 (remember the before times?), I gave a talk at North Bay Python on coroutines, AsyncIO, and how they work. It’s a crazy journey from bytecode and runtime instructions to building our own terrible event loop in pure Python, using nothing but generators and tears. Enjoy!

And if that was too slow, don’t miss the shorter rendition of this talk that I gave at PyCascades 2020, right before COVID ruined everything.

Talk @ PyCon AU 2018: Refactoring Code with the Standard Library

A week ago today, I gave a talk at PyCon Australia in Sydney. I discussed refactoring in Python, and how to build refactoring tools using nothing but the standard library, building up from concepts to syntax trees to how the lib2to3 module works. The talk finished with the announcement of Bowler, the open source refactoring framework I built at Facebook using those same concepts. I really appreciated the questions from the audience and in the halls afterwards, and we really enjoyed our time in Sydney. Thanks to everyone who made PyCon so great!

Talk @ PyCon US 2018: Thinking Outside the GIL

This past Friday, I presented a talk at PyCon US 2018 – the first time I’ve been fortunate enough to attend. The talk focused on achieving high performance from modern Python services through the use of AsyncIO and the multiprocessing module. The turnout was better than I could have ever expected, and I was really happy to hear from everyone who stopped by the Facebook booth to ask questions, discuss Facebook engineering practices, or even just say “hello”. Thank you to everyone who made my first PyCon amazing!

Interview: PyDev of the Week

This week, I was interviewed by Mike Driscoll of Mouse vs Python for their “PyDev of the Week” series, focusing on developers in the Python community.

A short quote, and timely announcement:

I’m currently preparing a talk for PyCon 2018 on using asyncio with multiprocessing for highly-parallel monitoring and/or scraping workloads. To go with this talk, I’m working on some simple example code that I hope to publish on Github. This will be my first major conference talk, so I’m both excited and absolutely terrified! 😅

I’m looking forward to giving that talk, and will post a video here afterwards!

Python: Using KeyboardInterrupt with a Multiprocessing Pool

I’ve recently been working on a parallel processing task in Python, using the multiprocessing module’s Pool class to manage multiple worker processes. Work comes in large batches, so there are frequent periods (especially right after startup) where all of the workers are idle. Unfortunately, when workers are idle, Python’s KeyboardInterrupt is not handled correctly by the multiprocessing module, which not only results in a lot of stacktraces spewed to the console, but also means the parent process will hang indefinitely.

There are quite a few suggestions for mitigating this issue, such as those given in this question on Stack Overflow. Many of them point to Bryce Boe’s article, where he advocates rolling your own replacement for the multiprocessing module’s Pool class, but that not only invites bugs and added maintenance overhead, it also fails to address the root cause.

I have figured out (what I think is) a better solution to the problem, and have not found anyone else mentioning it online, so I have decided to share that here. It not only solves the problem of handling the interrupt for both idle and busy worker processes, but also precludes the need for worker processes to even care about KeyboardInterrupt in the first place.

The solution is to prevent the child processes from ever receiving KeyboardInterrupt in the first place, leaving it completely up to the parent process to catch the interrupt and clean up the process pool as it sees fit. In my opinion this is the cleanest solution, because it reduces the amount of error handling code in the child processes and prevents needless error spew from idle workers.

The following example shows how to do this, and how it works with both idle and busy workers:

#!/usr/bin/env python3

# Copyright (c) Amethyst Reese
# Licensed under the MIT License

import multiprocessing
import os
import signal
import time

def init_worker():
    # Ignore SIGINT in worker processes; only the parent should
    # handle KeyboardInterrupt and decide how to clean up the pool.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def run_worker():
    # Simulate a long-running job in the worker.
    time.sleep(15)

def main():
    print("Initializing 5 workers")
    # The initializer runs once in each worker process as it starts.
    pool = multiprocessing.Pool(5, init_worker)

    print "Starting 3 jobs of 15 seconds each"
    for i in range(3):
        pool.apply_async(run_worker)

    try:
        print "Waiting 10 seconds"
        time.sleep(10)

    except KeyboardInterrupt:
        print "Caught KeyboardInterrupt, terminating workers"
        pool.terminate()
        pool.join()

    else:
        print "Quitting normally"
        pool.close()
        pool.join()

if __name__ == "__main__":
    main()

This code is also available on GitHub as amyreese/multiprocessing-keyboardinterrupt. If you think there’s a better way to accomplish this, please feel free to fork it and submit a pull request. Otherwise, hopefully this helps settle this issue for good.