Web scraping workshops

April 25, 2023 @ 2:00 pm - May 2, 2023 @ 4:00 pm

Whether your motivation is personal advancement or keeping pace with those who already benefit from this technology, the web scraping workshops are a chance to deepen your understanding and pick up a perspective you can show off to your friends at parties. In other words, even if you are not an avid programmer, use this as an opportunity to understand how people code this, and then ask ChatGPT to do it for you!

By the end of the two workshops, you will understand how data ends up on the internet and how you can acquire it for your own research purposes.

! If you participate in both workshops, you will receive a BUILD lab certificate, which will allow you to store your script on our servers for scraping purposes. !


April 18th – Part 1 | Time: 14:00 – 16:00 | Room: 3A12-14
Creating research questions, understanding web infrastructure, and utilization of APIs.

In this workshop, we will learn how to scope a project that involves acquiring data from the internet and how to formulate a research question for it. To understand what data can be acquired, we will examine APIs and how to query them, and we will inspect website infrastructure (HTML, CSS, and JavaScript).
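As a rough illustration of what querying an API involves, the sketch below builds a request URL from parameters and pulls one field out of a JSON response. The endpoint `https://api.example.org/v1/records`, the parameter names, and the `{"results": [...]}` response shape are all assumptions made up for this example; a real API documents its own URL and fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.org/v1/records"


def build_query_url(base_url, params):
    """Attach query parameters to a base URL, e.g. ?topic=housing&limit=2."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_titles(payload):
    """Read the "title" field of each record in an assumed {"results": [...]} payload."""
    return [record["title"] for record in payload.get("results", [])]


# A made-up response body, so the parsing step can be tried offline.
SAMPLE_BODY = '{"results": [{"id": 1, "title": "First"}, {"id": 2, "title": "Second"}]}'

if __name__ == "__main__":
    print(build_query_url(BASE_URL, {"topic": "housing", "limit": 2}))
    print(extract_titles(json.loads(SAMPLE_BODY)))
```

Against a real service, `fetch_json(build_query_url(...))` would replace the sample body; keeping the URL building and the field extraction in separate functions makes each step easy to check on its own.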

If you are planning on writing your thesis next spring, this could be a good opportunity for you to brainstorm your topic ideas or even find a like-minded thesis partner!

No programming skills are required. (Slides can be found here)


April 25th – Part 2 | Time: 14:00 – 16:00 | Room: 3A12-14
Developing autonomous data scraping scripts using Python.

This is where you can learn how to apply programming skills and turn your idea into a full project. We will examine how the cycle of fetching a web page, selecting data, parsing it, storing it, and repeating can be fully automated by developing a script in Python and scheduling it with Bash.
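The fetch, select, parse, store, repeat cycle can be sketched roughly as follows. This is a minimal illustration rather than the workshop's own script: it uses Python's built-in `html.parser` so that it runs without extra installs, and the `<h2 class="headline">` markup, the `PAGES` dictionary, and the function names are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and ("class", "headline") in attrs:
            self._inside = True
            self.headlines.append("")

    def handle_data(self, data):
        if self._inside:
            self.headlines[-1] += data

    def handle_endtag(self, tag):
        if tag == "h2":
            self._inside = False


def fetch_page(url, timeout=10):
    """Download a page as text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(urls, fetch, out_file):
    """Run the cycle for each URL and return the number of rows written."""
    writer = csv.writer(out_file)
    rows = 0
    for url in urls:                                  # repeat
        html = fetch(url)                             # fetch
        parser = HeadlineParser()
        parser.feed(html)                             # select and parse
        for text in parser.headlines:
            writer.writerow([url, text.strip()])      # store
            rows += 1
    return rows


# Made-up pages standing in for real downloads, so the sketch runs offline.
PAGES = {
    "https://example.org/a": '<h2 class="headline">Alpha</h2><p>body</p>',
    "https://example.org/b": '<h2 class="headline"> Beta </h2><h2 class="headline">Gamma</h2>',
}

if __name__ == "__main__":
    buffer = io.StringIO()
    scrape(list(PAGES), PAGES.__getitem__, buffer)
    print(buffer.getvalue())
```

Passing `fetch_page` instead of `PAGES.__getitem__`, and an open file instead of the in-memory buffer, would turn the same loop into a real run that a scheduler could start at fixed intervals.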

We will learn more about webpage inspection and the Python library BeautifulSoup, and about how APIs can be queried continuously in order to store and accumulate the data. Additionally, we will take a look at opportunities for you to use Raspberry Pis or store your scripts at BUILD lab to scrape your data for you over the summer!
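When an API is queried repeatedly, the same records often come back more than once, so the storing step has to accumulate without duplicating. One possible approach, sketched below, appends only unseen records to a JSON Lines file. It assumes every record carries a unique `"id"` field; the file name and function names are hypothetical.

```python
import json
import os
import tempfile


def load_seen_ids(path):
    """Return the ids already stored in the file (empty set if it does not exist)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return {json.loads(line)["id"] for line in f if line.strip()}


def append_new_records(path, records):
    """Append records whose id is not stored yet; return how many were added."""
    seen = load_seen_ids(path)
    added = 0
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            if record["id"] in seen:
                continue  # already stored by an earlier run
            f.write(json.dumps(record) + "\n")
            seen.add(record["id"])
            added += 1
    return added


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "records.jsonl")
        # Two simulated query results that overlap on id 2.
        print(append_new_records(path, [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]))
        print(append_new_records(path, [{"id": 2, "value": "b"}, {"id": 3, "value": "c"}]))
```

Each scheduled run would call `append_new_records` with the latest API response, so the file grows over time while earlier records are left untouched.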

Some programming abilities are required (Find the files here and slides here).


Thank you for participating!

 

