Real Web Scraping (And how to avoid it)

Authors
    John Partee

For whatever reason, I get paid to scrape data pretty often. Here's how Andy and I handle it, and how you can work faster by avoiding it. The code snippets cover how to scrape and how to scrape really fast, and the GitHub links are included below for convenience. Message us on LinkedIn, email us, or tweet at us with questions! Huge thanks to Andy for putting together the "API hunting" screenshots and for co-leading this lecture with me!

We didn't cover GraphQL in much depth, but it's worth a closer look. It's the next "big thing" in internet stuff. This site uses it!
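If you haven't seen one before, a GraphQL call is usually just a POST with a JSON body holding a `query` string and a `variables` object. Here's a minimal sketch of what that looks like from Python using only the standard library. The endpoint URL, the `posts` field, and the `build_graphql_request` helper are made up for illustration; they are not this site's actual schema or code from the lecture.

```python
import json
import urllib.request

# Hypothetical endpoint and schema, used only for illustration.
GRAPHQL_URL = "https://example.com/graphql"

QUERY = """
query Posts($first: Int!) {
  posts(first: $first) {
    title
    slug
  }
}
"""


def build_graphql_request(url, query, variables):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(GRAPHQL_URL, QUERY, {"first": 5})
# To actually send it, pass `request` to urllib.request.urlopen()
# and json-decode the response body.
```

The nice part for scraping is that you ask for exactly the fields you want and get JSON back, so there's no HTML to parse at all.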

Andy Heroy is a healthcare data scientist with Universal Consulting Inc. You can find him on LinkedIn and GitHub.

Resources

Requests-html

Async Scraping Examples

Curl to Python Converter

Intro to GraphQL

Working with JSON Data

Intro to CSS selectors (DO THIS)
