My top preference for data munging and harvesting from the web is Internet Explorer. Yes, Internet Explorer! 🙂 That's because I can create an InternetExplorer.Application COM object in PowerShell and access the HTML DOM to scrape web data as and when required.
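As a rough illustration of that idea, here is a minimal sketch of creating the COM object and reading something from the DOM. It assumes a Windows machine where Internet Explorer is still available; the URL is only a placeholder and the variable names are my own choice.

```powershell
# Minimal sketch: drive Internet Explorer through COM and read from the HTML DOM.
# Assumption: Windows with Internet Explorer installed; the URL is a placeholder.
$ie = New-Object -ComObject InternetExplorer.Application
$ie.Visible = $true
$ie.Navigate('https://example.com')

# Wait until the browser is idle and reports READYSTATE_COMPLETE (4).
while ($ie.Busy -or $ie.ReadyState -ne 4) {
    Start-Sleep -Milliseconds 200
}

# Read data from the DOM: the page title and the visible text of the body.
$ie.Document.title
$ie.Document.body.innerText

# Close the browser when done.
$ie.Quit()
```

From here, any DOM property or collection can be inspected the same way to pull out the data you need.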
The problem arises when not all of the information on a web page is populated when the page first loads, and you have to manually drag Internet Explorer's scroll bar down to populate more content. But how do you do that programmatically?
Luckily, the window object of the HTML DOM comes to our rescue with a scrollTo() method, which can be used to scroll the page and populate more content. This is very handy for web scraping.
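A minimal sketch of the scrolling could look like the following. It assumes scrollTo() is called on the DOM window object reached through `$ie.Document.parentWindow`; the URL, the step size, the number of scrolls, and the pause length are all placeholder values to tune for the page you are scraping.

```powershell
# Minimal sketch: scroll an Internet Explorer window step by step so that
# lazily loaded content gets a chance to appear.
# Assumptions: Windows with Internet Explorer; URL and numbers are placeholders.
$ie = New-Object -ComObject InternetExplorer.Application
$ie.Visible = $true
$ie.Navigate('https://example.com')

# Wait for the initial page load to complete (ReadyState 4 = complete).
while ($ie.Busy -or $ie.ReadyState -ne 4) {
    Start-Sleep -Milliseconds 200
}

$window = $ie.Document.parentWindow   # the DOM window object
$step   = 1000                        # pixels per scroll (arbitrary)

for ($i = 1; $i -le 5; $i++) {
    # scrollTo(x, y) moves the viewport to an absolute position on the page.
    $window.scrollTo(0, $i * $step)

    # Give the page time to fetch and render the newly revealed content.
    Start-Sleep -Seconds 2
}
```

Because scrollTo() takes an absolute position, the loop increases the y coordinate on every pass. Once the scrolling is done, the DOM can be scraped as usual.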
Hence this quick post for people who may find it useful, because I found the answer after a lot of research 🙂
The example in the animation above is used for harvesting data from Twitter, and you can find the full blog post here
Also, take a look at the web series here to learn more about data munging and harvesting, and about where scrolling Internet Explorer can help when harvesting data from the web. The series covers:
- Automating “From the Blog Archives” Tweets using Powershell
- Pumping Reddit user trend to AWS CloudWatch with Powershell
- Capturing & Analyzing online users Trend on Reddit with Powershell
- Powershell fiddling around Web scraping, Twitter – User Profiles, Images and much more
- Get example Sentence’s for a Word using Web scraping on online dictionary
- Get-Quote using Powershell
- PowerShell: Web-hosted Image Scraping
- [ Powershell ] Data Harvesting all dictionary words for each alphabet from Web
- PowerShell: Get Synonyms using Online Thesaurus
- Powershell: How to get Cricket Live Scores to Your Powershell Console
- PowerShell: Import / Query All Windows System Error Codes for Description
Please do follow me on Twitter for more interesting PowerShell material, and don't forget to show off the cool web scraping techniques you learn to your colleagues. Thanks for reading. Cheers! 😉