Description
Learn how to use AI to extract information from websites in this practical course, starting from the absolute basics.
In this course we'll use AI assistants to create a price-watching application. It'll be able to scrape all product pages of an e-commerce website and record their prices. Data from several runs of such a program is useful for spotting trends in price changes, detecting discounts, and so on. Unlike a carelessly vibe-coded program, the end product will reach a level of quality that allows further extension and comfortable maintenance, so that it can be published to Apify Store.
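To make the idea concrete, here's a minimal sketch of such an extraction in Python. The URL and the `.price` CSS selector are hypothetical placeholders, and the course builds a far more robust version:

```python
# Minimal price-extraction sketch. The URL and the ".price" selector are
# hypothetical placeholders; real pages need inspection to find the right ones.
# One possible way to run it without installing anything globally:
#   uv run --with httpx --with beautifulsoup4 watch.py
import httpx
from bs4 import BeautifulSoup

URL = "https://example.com/products/some-item"  # placeholder product page

response = httpx.get(URL)
response.raise_for_status()  # fail loudly on HTTP errors
soup = BeautifulSoup(response.text, "html.parser")
price = soup.select_one(".price").text.strip()
print(price)
```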
What we'll do
- Use LLM_CHAT (TBD) to create a program which extracts data from a web page.
- Save extracted data in various formats, e.g. CSV, which MS Excel or Google Sheets can open (see the sketch after this list).
- Use LLM_AGENT_GUI_OR_TUI (TBD) to improve the program so that it is robust and maintainable.
- Save time and effort with Apify's scraping platform.
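As an illustration of the CSV step, here's one way to write extracted records so that MS Excel or Google Sheets can open them directly. The field names and values are made up for the example:

```python
# Saving extracted records as CSV with Python's standard library.
# The records below are made-up examples; real data comes from the scraper.
import csv

records = [
    {"url": "https://example.com/products/a", "title": "Item A", "price": "19.99"},
    {"url": "https://example.com/products/b", "title": "Item B", "price": "7.50"},
]

with open("products.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["url", "title", "price"])
    writer.writeheader()  # header row so spreadsheet columns are labeled
    writer.writerows(records)
```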
Who this course is for
Anyone who has basic experience chatting with an AI assistant, an affinity for building digital products, and a desire to start with web scraping can take this course. The course does not expect any prior knowledge of web technologies or scraping.
Requirements
- A macOS, Linux, or Windows machine with a web browser.
- Prior experience chatting with AI assistants, such as OpenAI ChatGPT, Google Gemini, or Anthropic Claude.
- Familiarity with running commands in Terminal (macOS/Linux) or Command Prompt (Windows).
Lessons
- Chatting to code - Copying code from the chat and pasting it to local files. Running it with uv. Creating a basic scraper which does what we need.
- Using an agent - Explaining the benefits (delegation and independent work, AGENTS.md). Getting the environment ready and learning the ropes with a GUI/TUI. Using the Apify CLI to start a project. Creating a basic scraper which does what we need.
- Docs-driven prompting - Improving the README, e.g. describing input and output. Pointing the agent to the README and turning the design into reality.
- Test-driven prompting - Adding fixtures and expectations. Setting up tests and teaching the agent to run them. Dealing with corner cases by pointing the agent to the fixtures (see the test sketch after this list).
- Using a platform - Deploying to Apify and reaping the benefits of the platform. Running the scraper periodically, adding support for proxies, and more.
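To hint at what the test-driven lesson is about, here's a minimal sketch of a fixture-based test. The fixture path, the expected price, and the `parse_price()` helper are all hypothetical examples, not the course's actual code:

```python
# Fixture-based test sketch for a price parser. The fixture path, the
# expected value, and parse_price() itself are hypothetical examples.
# One possible way to run it: uv run --with pytest --with beautifulsoup4 pytest
from pathlib import Path

from bs4 import BeautifulSoup


def parse_price(html: str) -> str:
    # Placeholder parser; ".price" is a hypothetical selector
    return BeautifulSoup(html, "html.parser").select_one(".price").text.strip()


def test_parse_price_from_fixture():
    # The fixture would be a saved copy of a real product page
    html = Path("tests/fixtures/product.html").read_text()
    assert parse_price(html) == "$19.99"  # expectation recorded with the fixture
```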