◄︎ Gregor's Portfolio Site
09 Aug 2019

Build a scheduled web scraper with AWS Lambda

AWS Lambda is one of my favorite cloud services. The only limits are the imagination and skill set of the developers using it. Common uses include event-driven processing, batch data processing, log ingestion and analytics, and automation of the AWS environment. I've seen Lambda serve as the backend for websites, as an engine for customizing email delivery campaigns, as a monitor and controller for other systems, and as the brain behind trendy online chatbots and Slack bots. In this example we will use Lambda to build a scheduled task that automatically scrapes top-artist information from the iTunes website. Here are the services we'll be using:

  • Lambda (code execution)
  • S3 (static file hosting)
  • CloudWatch (scheduling)

A note on cost: Lambda offers the first 1 million requests per month for free. Beyond that it follows the standard AWS pay-per-use model, with requests costing $0.0000002 each in US East ($0.20 per million requests).
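
To make the cost note concrete, here is a rough sketch of the request-charge arithmetic using only the two figures quoted above. The function name and the example workloads are made up for illustration, and the estimate assumes request charges only; Lambda also bills for compute time, which this sketch leaves out.

```python
# Figures quoted in the cost note above (US East).
FREE_REQUESTS_PER_MONTH = 1_000_000
PRICE_PER_REQUEST_USD = 0.0000002


def monthly_request_cost(requests: int) -> float:
    """Estimate the monthly request charge in USD (hypothetical helper).

    Only requests beyond the free allowance are billed. Compute-time
    charges are not included in this estimate.
    """
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable * PRICE_PER_REQUEST_USD


# A scraper triggered once an hour makes about 24 * 31 = 744 requests a month,
# far below the free allowance.
hourly_scraper = monthly_request_cost(24 * 31)

# A much busier (made-up) workload of 6 million requests a month
# pays for 5 million of them.
busy_workload = monthly_request_cost(6_000_000)

print(f"hourly scraper: ${hourly_scraper:.2f}")
print(f"busy workload:  ${busy_workload:.2f}")
```

For a scheduled scraper like the one in this post, the request count stays far below the free allowance, so the request charge should be zero.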

    Identifying Our Source Data

    Create an S3 bucket

    Setting up our Lambda Function

    Working With The Lambda Code

    Scheduling Our Function