Scrape the Web: Strategies for programming websites that don't expect it (Part 003)

[VIDEO HAS ISSUES: Speaker walked away from the mic most of the time.] Do you find yourself faced with websites that have data you need to extract? Would your life be simpler if you could programmatically input data into web applications, even those tuned to resist interaction by bots? We'll discuss the basics of web scraping, and then dive into the details of different methods and where they are most applicable. You'll leave with an understanding of when to apply different tools, and learn about a "heavy hammer" for screen scraping that I picked up on a project for the Electronic Frontier Foundation. Attendees should bring a laptop, if possible, to try the examples we discuss and optionally take notes.

PyCon US Videos - 2009, 2010, 2011

PyCon is an activity of the Python Software Foundation, a 501(c)(3) non-profit organization. To support future conferences, please donate to the Foundation at www.python.org/psf/donations. Video and audio material from PyCon is licensed under the Creative Commons CC-BY-NC-SA license. This means you can incorporate excerpts or entire recordings in your own non-commercial projects, as long as you credit the speaker and you CC-license the finished project.