PowerShell: Get Example Sentences for a Word Using Web Scraping on an Online Dictionary


INTRODUCTION : 

Everybody comes across a word they don’t know how to use in a sentence. I face this often, as I do a ton of reading. Normally I would run a simple Google search, say for the word “Elixir”, which gives me a few websites with example sentences.


I would have opened one of these websites to get the example sentences, but I noticed some uniformity in the data presentation and the URLs on one website, yourdictionary.com. Upon inspecting the source code, I easily traced the HTML tags in which the data was enclosed.


Hence, I thought, why not harvest this website’s data (data scraping) and get all the sentences for a word?

HOW IT WORKS :

To implement this solution in PowerShell, I identified the HTML tag in which the data resides and its class (“Li_Content”), so I could filter exactly the sentences I want.

Once I had sufficient information, a simple Invoke-WebRequest to the site, with my query word (“Elixir”) appended to the URL, did most of the work:

Invoke-WebRequest "http://sentence.yourdictionary.com/Elixir" 
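
As a small sketch of how the request URL is put together for any word: the base address comes from the request above, while escaping the word with [uri]::EscapeDataString is my own addition for illustration (the script further down interpolates the word directly). The variable names are hypothetical.

```powershell
# The word to look up; any string would do here.
$Word = 'Elixir'

# Assumption: escaping is added for illustration so that input such as
# 'ad hoc' becomes 'ad%20hoc'. The original request does not escape the word.
$Url = 'http://sentence.yourdictionary.com/' + [uri]::EscapeDataString($Word)

# Show the URL that would be requested.
$Url

# The request itself needs network access, so it is left commented out:
# $Response = Invoke-WebRequest $Url -TimeoutSec 5
```

For a plain word like “Elixir” the escaped and unescaped URLs are identical, so the escaping only matters for input containing spaces or special characters.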

Some data wrangling on the HTML tag and class then extracts the sentences, which look like the following image.

[Image: extracted sentences]
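
To make the wrangling step concrete without hitting the site, here is a minimal sketch that pulls sentence text out of a hard-coded HTML snippet. The sample markup is an assumption modeled on the tag and class named above, and the regular expression is a stand-in for the ParsedHtml filtering that the script below uses. It would not cope with nested div elements.

```powershell
# Assumed sample markup: each sentence sits in a <div class="li_content">.
$Html = @'
<div class="li_content">The elixir restored his strength.</div>
<div class="other">Not a sentence entry.</div>
<div class="li_content">She searched for the <b>elixir</b> of life.</div>
'@

# Match every <div> with class li_content and capture its inner HTML.
# (?is) = case-insensitive, and "." also matches newlines.
$Pattern = '(?is)<div[^>]*class="li_content"[^>]*>(.*?)</div>'

$Sentences = [regex]::Matches($Html, $Pattern) | ForEach-Object {
    # Strip any nested tags, then decode HTML entities such as &amp;
    $Text = $_.Groups[1].Value -replace '<[^>]+>', ''
    [System.Net.WebUtility]::HtmlDecode($Text).Trim()
}

# Emit the extracted sentences.
$Sentences
```

Only the two li_content entries survive; the div with the other class is skipped.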

HOW TO USE IT :

Run the function ‘Get-Sentence’ with your word, and use the -WordLimit parameter to cap the number of words per sentence or the -Count parameter to control the number of sentences returned.

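
The effect of the two parameters can be sketched on a few made-up sentences. The sample data and variable names here are hypothetical; the filter mirrors how I understand -WordLimit and -Count to behave, with a word limit of 0 meaning no limit.

```powershell
# Hypothetical sample sentences standing in for scraped results.
$Samples = @(
    'The elixir worked.',
    'He drank the elixir and felt young again.',
    'No elixir can cure everything.'
)

$WordLimit = 5   # keep sentences of at most 5 words; 0 would mean no limit
$Count     = 2   # return at most 2 sentences

$Filtered = $Samples |
    # Count words by splitting on spaces, the same way the script does.
    Where-Object { -not $WordLimit -or $_.Split(' ').Count -le $WordLimit } |
    # Then keep only the first $Count sentences that passed the filter.
    Select-Object -First $Count

$Filtered
```

With the function loaded, the corresponding call would be Get-Sentence -Word Elixir -WordLimit 5 -Count 2.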

You can also use the -HighlightWord switch to highlight the word you queried in each sentence.

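
The highlighting idea can be tried on a single sentence: split it into words, print matching words in color and everything else as-is. This is a simplified sketch with a made-up sentence. Collecting the matched words in $Matched is my addition so the result can be inspected.

```powershell
$Word     = 'elixir'
$Sentence = 'The Elixir was sold as an elixir of youth.'

$Matched = foreach ($Token in $Sentence.Split(' ')) {
    # -like is case-insensitive, so both 'Elixir' and 'elixir' match.
    if ($Token -like "*$Word*") {
        Write-Host $Token -NoNewline -ForegroundColor Black -BackgroundColor Yellow
        Write-Host ' ' -NoNewline
        $Token   # also emit the token so the matches can be inspected
    }
    else {
        Write-Host "$Token " -NoNewline
    }
}

# End the highlighted line.
Write-Host ''
```

Because the match uses wildcards on both sides, a word such as “elixirs” would be highlighted too.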

The following animation also demonstrates how to run the function.

[Animation: running Get-Sentence]

SCRIPT

Function Get-Sentence
{
    [CmdletBinding()]
    [Alias('gs')]
    param(
        [Parameter(Mandatory=$true)] [String] $Word,
        [int] $Count = 10,
        [int] $WordLimit,
        [Switch] $HighlightWord
    )
    Try
    {
        Write-Verbose "Sending web request to http://sentence.yourdictionary.com/$Word for sentences"
        $Results = Invoke-WebRequest "http://sentence.yourdictionary.com/$Word" -TimeoutSec 5 -DisableKeepAlive
        $ErrorMsg = "Couldn't find any sentences with word `"$($Word.ToUpper())`", please try again with another word "
        # Check whether the web request returned any data
        If($Results)
        {
            $i = 0
            Write-Verbose "Harvesting data from web request"
            # Keep only the <div> elements whose class marks an example sentence
            # (ParsedHtml is available in Windows PowerShell, not in PowerShell 6+)
            $Data = $Results.ParsedHtml.getElementsByTagName('Div') | Where-Object {$_.ClassName -eq 'li_content'}
            # Check whether any sentences were found
            If($Data)
            {
                Write-Verbose "Populating the output"
                $Sentences = Foreach($Sentence in $Data)
                {
                    $WordCount = $Sentence.textContent.Split(' ').Count
                    # Keep the sentence when no word limit is set,
                    # or when it complies with the word limit
                    If(-not $WordLimit -or $WordCount -le $WordLimit)
                    {
                        $i = $i + 1
                        '' | Select-Object @{n='#';e={$i}},
                                           @{n='WordCount';e={$WordCount}},
                                           @{n='Sentence';e={$Sentence.textContent}}
                    }
                }
                $Sentences = $Sentences | Select-Object -First $Count
                # Highlight the queried word in each sentence when requested
                If($HighlightWord)
                {
                    $Sentences.Sentence | ForEach-Object {
                        $Words = $_.Split()
                        $Words | ForEach-Object {
                            If($_ -like "*$Word*")
                            {
                                Write-Host "$_" -NoNewline -ForegroundColor Black -BackgroundColor Yellow
                                Write-Host " " -NoNewline
                            }
                            else
                            {
                                Write-Host "$_ " -NoNewline
                            }
                        }
                        [System.Environment]::NewLine
                    }
                }
                else
                {
                    $Sentences
                }
            }
            Else
            {
                Write-Host $ErrorMsg -ForegroundColor Red
            }
        }
        else
        {
            Write-Host $ErrorMsg -ForegroundColor Red
        }
    }
    catch
    {
        Write-Host "ERROR: $_" -ForegroundColor Red
    }
}

Have fun exploring this script, and enjoy your weekend, PowerShell homies 🙂

Prateek Singh

 

 
