SensibleDefaults

Real Web Scraping (And how to avoid it)

Authors

For whatever reason, I get paid to scrape data pretty often. Here's how Andy and I handle it, and how you can work faster by avoiding it. The code snippets cover how to scrape and how to scrape really fast, and the GitHub links are included below for convenience. Reach out on LinkedIn, by email, or on Twitter with questions! Huge thanks to Andy for putting together the "API hunting" screenshots and for co-leading this lecture with me!

We didn't cover GraphQL in much depth, but it's worth a closer look. It's the next "big thing" in internet stuff. This site uses it!

Andy Heroy is a healthcare data scientist with Universal Consulting Inc. You can find him on LinkedIn and GitHub.

Resources

Requests-html

Async Scraping Examples

Curl to Python Converter

Intro to GraphQL

Working with JSON Data

Intro to CSS selectors (DO THIS)

John, Founder

John Partee

@_JohnPartee
