Scrape and download PDF files from Google or Bing using PowerShell

Here is a very simple script that you can run in PowerShell ISE. It could probably be written much better, but it works. The script uses the Google search engine's filetype: operator to search for files of a specific type and then downloads every link in the results that matches a regular expression. The same approach should also work with the Bing search engine.
To make the script work, make sure the directory C:\temp\dwnld\ exists. You can also easily change the regular expression pattern and the keywords.
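If you would rather not create the directory by hand, a small helper can do it. This is only a sketch: the function name New-DirectoryIfMissing is hypothetical and not part of the script below, while Test-Path and New-Item are standard cmdlets.

```powershell
# Hypothetical helper (not part of the original script):
# creates the directory only when it does not exist yet.
function New-DirectoryIfMissing {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )

    if (-not (Test-Path -Path $Path -PathType Container)) {
        # Out-Null hides the DirectoryInfo object that New-Item returns
        New-Item -Path $Path -ItemType Directory -Force | Out-Null
    }
}
```

Calling `New-DirectoryIfMissing -Path "C:\temp\dwnld\"` before starting the downloads should have the same effect as creating the folder manually.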

Comments with modifications to the script are always welcome 😉

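# Search terms; one search is run per keyword
$keywords = @("manual", "microsoft", "powershell")
# Matches http and https links that end in .pdf
$pattern = 'https?://(.*?)[.]pdf'
# Download target; this directory must already exist
$storageDir = "C:\temp\dwnld\"
$filetype = "pdf"
$rand = New-Object System.Random

$keywords | ForEach-Object {
    $urlToScrapeWithKeyword = "http://www.google.be/search?hl=nl&tbo=d&biw=1229&bih=677&output=search&sclient=psy-ab&q={0}+filetype%3A{1}&btnK=" -f $_, $filetype
    $urlToScrapeWithKeyword | Out-Default
    # Select-Object -Unique also drops non-adjacent duplicates (Get-Unique only works on sorted input)
    (Invoke-WebRequest -UseBasicParsing -Uri $urlToScrapeWithKeyword).Links | Select-Object -ExpandProperty href | Select-Object -Unique | ForEach-Object {
        if ($_ -match $pattern) {
            # $Matches[0] holds the full URL that matched the pattern
            $Matches[0] | Out-Default
            try {
                # -ErrorAction Stop makes transfer errors terminating, so the catch block sees them
                Start-BitsTransfer -Source $Matches[0] -Destination $storageDir -ErrorAction Stop
                "Download ok" | Out-Default
            } catch [Exception] {
                "Download failed:" | Out-Default
                $_.Exception.Message | Out-Default
            }
            # Wait 20 to 42 seconds between downloads
            "Sleeping" | Out-Default
            Start-Sleep -Seconds ($rand.Next(20, 43))
        }
    }
}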
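Before changing the regular expression, it can help to try it against a single link first. The sketch below assumes a made-up href (example.com is a placeholder, shaped like a search-result redirect link) and an illustrative pattern; substitute the pattern you want to try.

```powershell
# Hypothetical href for illustration only; example.com is a placeholder
$sampleHref = '/url?q=http://example.com/docs/manual.pdf&sa=U'

# Illustrative pattern: http or https, then the shortest run of characters up to ".pdf"
$testPattern = 'https?://(.*?)[.]pdf'

if ($sampleHref -match $testPattern) {
    # -match fills the automatic $Matches table; index 0 is the whole match
    $extractedUrl = $Matches[0]
    $extractedUrl | Out-Default   # http://example.com/docs/manual.pdf
} else {
    "No match" | Out-Default
}
```

Because the quantifier is lazy, the match stops at the first ".pdf", which cuts off trailing query parameters such as "&sa=U".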

Enjoy 😉
Don’t forget, web scraping can be illegal! Use it with care!


Raise PropertyChanged events without using strings in Prism

While testing something with Prism 4 + MEF, I found an interesting RaisePropertyChanged overload that raises PropertyChanged events without using the string name of the property. That is useful because a typo in such a string can be very painful. 🙂

Let’s say you have a ViewModel that derives from NotificationObject. In the setter I’m using the ‘better’ way of raising the PropertyChanged event:

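using Microsoft.Practices.Prism.ViewModel; // namespace of NotificationObject in Prism 4

public class TestViewModel : NotificationObject
{
    private string _Title;

    public string Title
    {
        get { return _Title; }
        set
        {
            // Only raise the event when the value actually changes
            if (value != _Title)
            {
                _Title = value;
                // The lambda names the property, so a typo becomes a compile error
                this.RaisePropertyChanged(() => this.Title);
            }
        }
    }
}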

instead of writing:

this.RaisePropertyChanged("Title");

If your project is already full of RaisePropertyChanged calls that use strings, you can easily replace them in Visual Studio with regular expressions:

– Press [Ctrl]+[F]
– Select [Quick Replace]
– In the dropdown menu choose [Replace in Files] instead of [Quick Replace]
– Look in: [Entire Solution]
– check [Use] and select [Regular expressions]
– Find what: RaisePropertyChanged\(\"{[:a]*}\"\)\;
– Replace with: RaisePropertyChanged(() => this.\1);
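The Find/Replace expressions above use the older Visual Studio dialect, with {} for the tagged group and \1 for the back-reference. Newer Visual Studio versions (2012 and later) use .NET regular expressions instead. As a rough sketch of the same substitution in .NET syntax, here it is with PowerShell's -replace operator; the sample line and variable names are assumptions for illustration.

```powershell
# Sample C# line as it might appear in a ViewModel (illustrative input)
$line = 'this.RaisePropertyChanged("Title");'

# (\w+) captures the property name; $1 inserts it into the replacement.
# Single quotes stop PowerShell from expanding $1 as a variable.
$findPattern = 'RaisePropertyChanged\("(\w+)"\);'
$replaceWith = 'RaisePropertyChanged(() => this.$1);'

$updated = $line -replace $findPattern, $replaceWith
$updated | Out-Default   # this.RaisePropertyChanged(() => this.Title);
```

The same two strings should also work in the Find and Replace fields of a Visual Studio version that uses .NET regular expressions.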

Have fun 😉