<# : Batch Commands (PowerShell Comments) Start @echo off & setlocal rem rem Multi-line Regular Expression Text Search rem rem rem RegExS.bat [] [] [] rem [-e ] [-g ] rem [-h] [-i] [-n] [-r] [-s] [-x ] rem [-bc ] [-cc ] [-fc ] rem [-Q/-Interactive] rem rem rem text pattern in .NET regular expression rem "^" and "$" characters match the beginning and the end of rem lines respectively. In order to make "."(period) match to rem "\n"(newline), prepend "(?s)" to the regular expression. rem You'll be prompted for it if not specified in the command rem line arguments or when this script is started by a double click. rem target folder in which search is performed rem You'll be prompted to choose one in the popped-up selection rem dialog box if not specified. rem comma-delimited list of filenames to be searched rem Wildcards may be used. The default value is "*". rem (e.g. "*.cpp, *.h") rem -e character encoding for the files without BOMs(Byte Order Marks) rem The default value is "Default" rem -g capture-group name or number rem The default value is "0". rem -h Search result is output to a HTML file. rem -i Uppercase/lowercase are ignored. rem -n Wide characters are converted into narrow ones prior to rem search. Useful when you don't want to distinguish them. rem -r Search is performed in subfolders recursively. rem -s Search is performed by a simple text match. rem -x comma-delimited list of filenames to be excluded from search rem Wildcards may be used. (e.g. "*.img, *.dat") rem The following binary format files are excluded by default. rem "*.exe, *.dll, *.lnk, *.zip, *.bmp, *.gif, *.jpg, *.png" rem -bc background color for matches rem -cc foreground color for capture-group matches rem -fc foreground color for matches rem -Q Interactive mode is forcedly enabled. rem rem Remarks: Files with .docx/.pptx/.xlsx/.xlsm extensions are supported experimentally. 
rem In addition, PDF files can be searched if "itextsharp.dll" exists in the same rem folder as this script (except for rasterized texts and numbers/dates in Excel). rem rem Created by earthdiver1 rem Version 2.00 rem Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. rem rem ---------------------------------------------------------------------------- echo %CMDCMDLINE% | findstr /i /c:"%~nx0" >NUL && set DC=1 rem The following is a preamble for converting a PowerShell script into polyglot rem one that also runs as a batch script. Change the file extension to .ps1 and rem run it from PowerShell console when debugging. set BATCH_ARGS=%* if defined BATCH_ARGS set BATCH_ARGS=%BATCH_ARGS:"=\"% if defined BATCH_ARGS set BATCH_ARGS=%BATCH_ARGS:^^=^% set P_CMD=$DoubleClicked=0%DC%;$PSScriptRoot='%~dp0'.TrimEnd('\');$input^|^ &([ScriptBlock]::Create((${%~f0}^|Out-String))) endlocal & PowerShell -NoProfile -Command "%P_CMD%" %BATCH_ARGS% exit/b rem ---------------------------------------------------------------------------- : Batch Commands (PowerShell Comments) End #> #Requires -Version 3 param ( [String]$Pattern, [String]$Dir, [String]$Include = "*", [Alias("e")][String]$Encoding = "Default", [Alias("x")][String]$Exclude = "", [Alias("g")][String]$Group = "0", [Alias("h")][Switch]$HtmlOutput = $False, [Alias("i")][Switch]$IgnoreCase = $False, [Alias("n")][Switch]$Narrow = $False, [Alias("r")][Switch]$Recurse = $False, [Alias("s")][Switch]$SimpleMatch = $False, [Alias("bc")][ConsoleColor]$BackgroundColor = "Blue", [Alias("cc")][ConsoleColor]$CapturegroupColor = "Red", [Alias("fc")][ConsoleColor]$ForegroundColor = "White", [Alias("Q")][Switch]$Interactive = $False ) #$DebugPreference = "Continue" Function RegExS { <# .SYNOPSIS Regular-Expression text Search .DESCRIPTION This function searches for text patterns in multiple text files. Matched characters are highlighted. .PARAMETER Pattern Specifies the text to find. 
    Type a string or regular expression. If you type a string, use the SimpleMatch parameter.
.PARAMETER Dir
    Specifies the target folder in which the search is performed.
.PARAMETER Include
    Specifies the comma-delimited list of filenames to be searched.
    Wildcards may be used; e.g. "*.cpp, *.h". The default value is "*".
.PARAMETER Encoding
    Specifies the character encoding for the files without a BOM. (Alias: -e)
    The default value is "Default".
.PARAMETER BackgroundColor
    Specifies the background color for matches. (Alias: -bc)
    The default value is "Blue".
.PARAMETER CapturegroupColor
    Specifies the foreground color for capture-group matches. (Alias: -cc)
    The default value is "Red".
.PARAMETER ForegroundColor
    Specifies the foreground color for matches. (Alias: -fc)
    The default value is "White".
.PARAMETER Group
    Specifies the name or number of the capture group. (Alias: -g)
    The default value is "0".
.PARAMETER HtmlOutput
    Redirects output to an HTML file. (Alias: -h)
.PARAMETER IgnoreCase
    Makes matches case-insensitive. By default, matches are case-sensitive. (Alias: -i)
.PARAMETER Narrow
    Converts wide characters into narrow ones. (Alias: -n)
    Useful when you don't want to distinguish between narrow and wide characters.
.PARAMETER Recurse
    Searches all files in all subfolders. (Alias: -r)
.PARAMETER SimpleMatch
    Uses a simple match rather than a regular expression match. (Alias: -s)
.PARAMETER Exclude
    Specifies the comma-delimited list of filenames to be excluded. (Alias: -x)
    Wildcards may be used; e.g. "*.img, *.dat".
    Note that the following binary format files are excluded by default:
    *.exe, *.dll, *.lnk, *.zip, *.bmp, *.gif, *.jpg, *.png .
.NOTES
    Author: earthdiver1
    Version: V2.00
    Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
.EXAMPLE
    ./RegExS.ps1 """(?:[^""\\\n]|\\.)*""|'(?:[^'\\\n]|\\.)*'|(/\*[\S\s]*?\*/|//.*$)" . "*.cpp,*.h" -g 1
    Comments in the C++ source files in the current work folder are searched.
.REMARKS
    Files with .docx/.pptx/.xlsx/.xlsm extensions are supported experimentally.
    In addition, PDF files can be searched if "itextsharp.dll" exists in the same folder as this script.
    (Except for rasterized texts and numbers/dates in Excel)
#>
    param (
        [String]$Pattern,
        [String]$Dir,
        [String]$Include = "*",
        [ValidateSet("", "Unknown", "String", "Unicode", "Byte", "BigEndianUnicode", `
                     "UTF8", "UTF7", "UTF32", "Ascii", "Default", "Oem", "BigEndianUTF32")]
        [Alias("e")][String]$Encoding = "Default",
        [Alias("x")][String]$Exclude = "",
        [Alias("g")][String]$Group = "0",
        [Alias("h")][Switch]$HtmlOutput,
        [Alias("i")][Switch]$IgnoreCase,
        [Alias("n")][Switch]$Narrow,
        [Alias("r")][Switch]$Recurse,
        [Alias("s")][Switch]$SimpleMatch,
        [Alias("bc")][ConsoleColor]$BackgroundColor = "Blue",
        [Alias("cc")][ConsoleColor]$CapturegroupColor = "Red",
        [Alias("fc")][ConsoleColor]$ForegroundColor = "White"
    )
    ############ Edit Here ############
    $BinaryFiles = "*.exe, *.dll, *.lnk, *.zip, *.bmp, *.gif, *.jpg, *.png"
    $HtmlFile = "$($script:PSScriptRoot)\RegExS_Output.htm"
#   $HtmlFile = "$($script:PSScriptRoot)\RegExS_Output_$((Get-Date).ToString('yyyyMMddHHmm')).htm"
    ###################################
    if (-not $Pattern) { Write-Host "'Pattern' not specified." -Fore Red; Get-Help RegExS; return }
    if (-not $Dir)     { Write-Host "'Dir' not specified." -Fore Red; Get-Help RegExS; return }
    $Dir = Convert-Path -LiteralPath $Dir
    if (-not $Dir)     { Write-Host "'Dir' does not exist." -Fore Red; Get-Help RegExS; return }
    [Array]$IsContainer = Get-Item $Dir | %{ $_.PSIsContainer }
    if ((-not $Recurse) -and $IsContainer.Length -eq 1 -and $IsContainer[0]) { $Dir += "\*" }
    # A simple match escapes the pattern; otherwise "^" and "$" match at line boundaries.
    if ($SimpleMatch) { $Pattern = [regex]::Escape($Pattern) } else { $Pattern = "(?m)" + $Pattern }
    if ($IgnoreCase) { $Pattern = "(?i)" + $Pattern }
    if ($HtmlOutput) { HtmlHeader }
    [String[]]$IncludeList = $Include -Split "," | %{ $_.Trim() }
    [String[]]$ExcludeList = ($Exclude + "," + $BinaryFiles) -Split "," | %{ $_.Trim() } | ?{ $_ }
    $global:ErrorView = "CategoryView"
    if ($IncludeList.Length -eq 1) {    # "-Filter" is faster than "-Include"
        $files = Get-ChildItem $Dir -Filter  $Include     -Exclude $ExcludeList -Recurse:$Recurse -File
    } else {
        $files = Get-ChildItem $Dir -Include $IncludeList -Exclude $ExcludeList -Recurse:$Recurse -File
    }
    $global:ErrorView = "NormalView"
    $ErrorActionPreference = "Stop"
    trap [System.Exception] {
        Write-Host "System error occurred." -Fore Red
        Write-Host $Error[0].ToString() $Error[0].InvocationInfo.PositionMessage
        if ($HtmlOutput) {
            HtmlFooter
            try {
                [Text.Encoding]::UTF8.GetBytes($script:Html.ToString()) | Set-Content -Path $HtmlFile -Encoding Byte  # UTF-8N
                [void]$script:Html.Clear()
            } catch [System.Exception] {}
        }
        return
    }
    [Int]$Nmfile = 0
    [Int]$Nmatch = 0
    [Int]$Nmline = 0
    [Int]$Nmchar = 0
    if ($files) {
        :LOOP foreach ($file in $files) {
            Write-Debug "RegExS: $($file.FullName)"
            try {
                switch ($file.Extension) {
                    ".docx" { $io = ReadDocxFile $file.FullName ; break }
                    ".pptx" { $io = ReadPptxFile $file.FullName ; break }
                    ".xlsm" { $io = ReadXlsxFile $file.FullName ; break }
                    ".xlsx" { $io = ReadXlsxFile $file.FullName ; break }
                    ".pdf"  { $io = ReadPdfFile  $file.FullName ; break }
                    Default {
                        $enc = GetEncodingFromBOM $file
                        if (-not $enc) {
                            if (IsBinary $file) {
                                Write-Host "Skipping binary format file: $($file.FullName)." -Fore Green
                                continue LOOP
                            }
                            $enc = $Encoding
                        }
                        $io = (Get-Content $file.FullName -Encoding $enc -Raw) -Replace "\r",""
                    }
                }
            } catch [System.UnauthorizedAccessException], `
                    [System.Management.Automation.ItemNotFoundException], `
                    [System.IO.IOException] {
                Write-Host "$($Error[0].Exception.Message) Continuing processing." -Fore Red
                if ($HtmlOutput) { [void]$script:Html.Append("$(Htmlify $Error[0].Exception.Message) Continuing processing.`n") }
                continue
            }
            if ($Narrow) {
                Add-Type -AssemblyName "Microsoft.VisualBasic"
                $io      = [Microsoft.VisualBasic.Strings]::StrConv($io,[Microsoft.VisualBasic.VbStrConv]::Narrow)
                $Pattern = [Microsoft.VisualBasic.Strings]::StrConv($Pattern,[Microsoft.VisualBasic.VbStrConv]::Narrow)
            }
            [Array]$Matches = Select-String -InputObject $io -Pattern $Pattern -CaseSensitive -AllMatches `
                              | %{ $_.Matches } | ?{ $_.Groups[$Group].Success }
            if ($Matches) {
                $Nmfile++
                Write-Host $file.FullName -Fore Yellow
                # $bol holds the index of every beginning-of-line, plus a sentinel past the end of the text.
                $bol = [Array](Select-String -InputObject $io -Pattern '(?m)^' -AllMatches | %{ $_.Matches } | %{ $_.Index }) `
                       + ($io.Length + 1)
                [Int]$nl = -1
                if (-not $HtmlOutput) {
                    if ($Group -eq "0") {
                        for ([Int]$i=0; $i -lt $Matches.Length; $i++) {
                            $MatchIndex  = $Matches[$i].Groups[0].Index
                            $MatchLength = $Matches[$i].Groups[0].Length
                            $MatchString = $Matches[$i].Groups[0].Value -Replace "`n","`n`t"
                            $NextMatch   = $Matches[$i+1]
                            if ($bol[$nl+1] -le $MatchIndex) {
                                while ($bol[$nl+1] -le $MatchIndex) { $nl++ }
                                $index = $bol[$nl]
                                if ($index -ge $io.Length) { break }
                                $Nmline++
                                if ($file.Extension -eq ".xlsx" -or $file.Extension -eq ".xlsm") {
                                    Write-Host $("{0,5}:" -F $script:cell[$nl]) -NoNewline
                                } else {
                                    Write-Host $("{0,5}:" -F ($nl+1)) -NoNewline
                                }
                            }
                            $Nmatch++
                            if ($MatchLength -gt 0) {
                                $Nmchar += $MatchLength
                                Write-Host $io.SubString($index, $MatchIndex - $index) -NoNewline
                                Write-Host $MatchString -Back $BackgroundColor -Fore $ForegroundColor -NoNewline
                                $index = $MatchIndex + $MatchLength
                                while ($bol[$nl+1] -le $index) { $nl++; $Nmline++ }
                            }
                            if ($NextMatch -and
$NextMatch.Index -lt $bol[$nl+1]) { continue } $eol = $bol[$nl+1] - 1 if ($eol -eq $io.Length) { $eol-- } if ($io[$eol] -eq "`n") { $eol-- } if ($index -le $eol) { Write-Host $io.SubString($index, $eol+1 - $index) } else { Write-Host } } } else { for ([Int]$i=0; $i -lt $Matches.Length; $i++) { $Match0Index = $Matches[$i].Groups[0].Index $Match0Length = $Matches[$i].Groups[0].Length $MatchIndex = $Matches[$i].Groups[$Group].Index $MatchLength = $Matches[$i].Groups[$Group].Length $MatchString = $Matches[$i].Groups[$Group].Value -Replace "`n","`n`t" $NextMatch = $Matches[$i+1] if ($bol[$nl+1] -le $Match0Index) { while ($bol[$nl+1] -le $Match0Index) { $nl++ } $index = $bol[$nl] if ($index -ge $io.Length) { break } $Nmline++ if ($file.Extension -eq ".xlsx" -or $file.Extension -eq ".xlsm") { Write-Host $("{0,5}:" -F $script:cell[$nl]) -NoNewline } else { Write-Host $("{0,5}:" -F ($nl+1)) -NoNewline } } $Nmatch++ if ($Match0Length -gt 0) { $Nmchar += $MatchLength Write-Host $io.SubString($index, $Match0Index - $index) -NoNewline Write-Host $($io.SubString($Match0Index, $MatchIndex - $Match0Index) -Replace "`n","`n`t") ` -Back $BackgroundColor -Fore $ForegroundColor -NoNewline Write-Host $MatchString -Back $BackgroundColor -Fore $CapturegroupColor -NoNewline $index = $Match0Index + $Match0Length $index0 = $MatchIndex + $MatchLength Write-Host $($io.SubString($index0, $index - $index0) -Replace "`n","`n`t") ` -Back $BackgroundColor -Fore $ForegroundColor -NoNewline while ($bol[$nl+1] -le $index) { $nl++ $Nmline++ } } if ($NextMatch -and $NextMatch.Index -lt $bol[$nl+1]) { continue } $eol = $bol[$nl+1] - 1 if ($eol -eq $io.Length) { $eol-- } if ($io[$eol] -eq "`n") { $eol-- } if ($index -le $eol) { Write-Host $io.SubString($index, $eol+1 - $index) } else { Write-Host } } } } else { [void]$script:Html.Append("$($file.FullName)`n") if ($Group -eq "0") { for ([Int]$i=0; $i -lt $Matches.Length; $i++) { $MatchIndex = $Matches[$i].Groups[0].Index $MatchLength = 
$Matches[$i].Groups[0].Length $MatchString = $Matches[$i].Groups[0].Value -Replace "`n","`n`t" $NextMatch = $Matches[$i+1] if ($bol[$nl+1] -le $MatchIndex) { while ($bol[$nl+1] -le $MatchIndex) { $nl++ } $index = $bol[$nl] if ($index -ge $io.Length) { break } $Nmline++ if ($file.Extension -eq ".xlsx" -or $file.Extension -eq ".xlsm") { [void]$script:Html.Append(("{0,5}:" -F $script:cell[$nl])) } else { [void]$script:Html.Append(("{0,5}:" -F ($nl+1))) } } $Nmatch++ if ($MatchLength -gt 0) { $Nmchar += $MatchLength [void]$script:Html.Append($(Htmlify $io.SubString($index, $MatchIndex - $index))) [void]$script:Html.Append("$(Htmlify $MatchString)") $index = $MatchIndex + $MatchLength while ($bol[$nl+1] -le $index) { $nl++ $Nmline++ } } if ($NextMatch -and $NextMatch.Index -lt $bol[$nl+1]) { continue } $eol = $bol[$nl+1] - 1 if ($eol -eq $io.Length) { $eol-- } if ($io[$eol] -eq "`n") { $eol-- } if ($index -le $eol) { [void]$script:Html.Append("$(Htmlify $io.SubString($index, $eol+1 - $index))`n") } else { [void]$script:Html.Append("`n") } } } else { for ([Int]$i=0; $i -lt $Matches.Length; $i++) { $Match0Index = $Matches[$i].Groups[0].Index $Match0Length = $Matches[$i].Groups[0].Length $MatchIndex = $Matches[$i].Groups[$Group].Index $MatchLength = $Matches[$i].Groups[$Group].Length $MatchString = $Matches[$i].Groups[$Group].Value -Replace "`n","`n`t" $NextMatch = $Matches[$i+1] if ($bol[$nl+1] -le $Match0Index) { while ($bol[$nl+1] -le $Match0Index) { $nl++ } $index = $bol[$nl] if ($index -ge $io.Length) { break } $Nmline++ if ($file.Extension -eq ".xlsx" -or $file.Extension -eq ".xlsm") { [void]$script:Html.Append(("{0,5}:" -F $script:cell[$nl])) } else { [void]$script:Html.Append(("{0,5}:" -F ($nl+1))) } } $Nmatch++ if ($Match0Length -gt 0) { $Nmchar += $MatchLength [void]$script:Html.Append((Htmlify $io.SubString($index, $Match0Index - $index))) [void]$script:Html.Append("$(Htmlify $io.SubString($Match0Index, $MatchIndex - $Match0Index))") 
[void]$script:Html.Append("$(Htmlify $MatchString)") $index = $Match0Index + $Match0Length $index0 = $MatchIndex + $MatchLength [void]$script:Html.Append("$(Htmlify $io.SubString($index0, $index - $index0))") while ($bol[$nl+1] -le $index) { $nl++ $Nmline++ } } if ($NextMatch -and $NextMatch.Index -lt $bol[$nl+1]) { continue } $eol = $bol[$nl+1] - 1 if ($eol -eq $io.Length) { $eol-- } if ($io[$eol] -eq "`n") { $eol-- } if ($index -le $eol) { [void]$script:Html.Append("$(Htmlify $io.SubString($index, $eol+1 - $index))`n") } else { [void]$script:Html.Append("`n") } } } } } if ($file.Extension -eq ".xlsx" -or $file.Extension -eq ".xlsm") { $script:cell.Clear() } } } Write-Host "$Nmfile file, $Nmline line, $Nmatch string, $Nmchar character matches found." -Fore Green if ($HtmlOutput) { [void]$script:Html.Append("$Nmfile file, $Nmline line, $Nmatch string, $Nmchar character matches found.`n") HtmlFooter try { [Text.Encoding]::UTF8.GetBytes($script:Html.ToString()) | Set-Content -Path $HtmlFile -Encoding Byte # UTF-8N [void]$script:Html.Clear() Write-Host "The result has been output to $HtmlFile." -Fore Green Start-Process -FilePath "file:///$HtmlFile" } catch [System.Exception] { Write-Host "failed to output a HTML file." 
-Fore Red Write-Host $Error[0].Exception.Message } } } Function IsBinary($File) { if ($File.Length -lt 2) { return $False } if ($File.Length -gt 20000000) { return $True } $bytes = Get-Content $File.FullName -ReadCount 4096 -TotalCount 4096 -Encoding Byte [Int]$Nbo=0 [Int]$Nbe=0 [Int]$Nzo=0 [Int]$Nze=0 for ([Int]$i=0; $i -lt $bytes.Length; $i+=2) { $Nbo++ if ($bytes[$i] -eq 0) { $Nzo++ } } for ([Int]$i=1; $i -lt $bytes.Length; $i+=2) { $Nbe++ if ($bytes[$i] -eq 0) { $Nze++ } } if (($Nzo+$Nze -gt 0) -and ([System.Math]::Abs($Nzo/$Nbo-$Nze/$Nbe)/($Nbo+$Nbe) -lt 0.1)) { return $True } Write-Debug "IsBinary: $($file.Name) $($Nzo+$Nze), $([System.Math]::Abs($Nzo/$Nbo-$Nze/$Nbe)/($Nbo+$Nbe))" return $False } Function GetEncodingFromBOM($File) { $bytes = Get-Content $File.FullName -ReadCount 4 -TotalCount 4 -Encoding Byte $string = ($bytes | %{ "{0:X2}" -F $_ }) -Join "" switch -Regex ($string) { "^EFBBBF" { $enc="UTF8" ; break } "^FFFE0000" { $enc="UTF32" ; break } "^FFFE" { $enc="Unicode" ; break } "^0000FEFF" { $enc="BigEndianUTF32" ; break } "^FEFF" { $enc="BigEndianUnicode" ; break } "^2B2F76(38|39|2B|2F)" { $enc="UTF7" ; break } Default { $enc="" } } Write-Debug "GetEncodingFromBOM: $($File.Name) $($string) $($enc)" return $enc } Function ReadDocxFile($DocxFile) { Add-Type -AssemblyName WindowsBase $file = (Get-Childitem -Path $DocxFile).FullName $package = [System.IO.Packaging.Package]::Open($file, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read) $parts = $package.GetParts() | %{ $_ } $document = $parts | ?{ $_.Uri.OriginalString -eq "/word/document.xml" } $footnotes = $parts | ?{ $_.Uri.OriginalString -eq "/word/footnotes.xml" } $endnotes = $parts | ?{ $_.Uri.OriginalString -eq "/word/endnotes.xml" } $comments = $parts | ?{ $_.Uri.OriginalString -eq "/word/comments.xml" } $enc = [System.Text.Encoding]::UTF8 $sr = New-Object System.IO.StreamReader $document.GetStream(),$enc $text = New-Object System.Text.StringBuilder [void]$text.Append($sr.ReadToEnd()) 
    $sr.Close()
    if ($footnotes) {
        $sr = New-Object System.IO.StreamReader $footnotes.GetStream(),$enc
        [void]$text.Append($sr.ReadToEnd())
        $sr.Close()
    }
    if ($endnotes) {
        $sr = New-Object System.IO.StreamReader $endnotes.GetStream(),$enc
        [void]$text.Append($sr.ReadToEnd())
        $sr.Close()
    }
    if ($comments) {
        $sr = New-Object System.IO.StreamReader $comments.GetStream(),$enc
        [void]$text.Append("" + $sr.ReadToEnd())
        $sr.Close()
    }
    $package.Close()
    $t = $text.ToString()
    [void]$text.Clear()
    $t = $t -Replace "\r?\n",""
    $t = $t -Replace ""," "
    $t = $t -Replace "\s+"," "
    $t = $t -Replace "","`n"
    $t = $t -Replace "<[^>]+>",""
    # Decode the XML character entities left after the tags are stripped.
    $t = $t -Replace "&amp;","&"
    $t = $t -Replace "&lt;","<"
    $t = $t -Replace "&gt;",">"
    $t = $t -Replace "(?m)^ ?\r?\n",""
    return $t
}

Function ReadPptxFile($PptxFile) {
    Add-Type -AssemblyName WindowsBase
    $file = (Get-Childitem -Path $PptxFile).FullName
    $package = [System.IO.Packaging.Package]::Open($file, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read)
    $parts = $package.GetParts() | %{ $_ }
    $enc = [System.Text.Encoding]::UTF8
    $text = New-Object System.Text.StringBuilder
    for ([Int]$i = 1; $i -le 999; $i++) {
        $slide = $parts | ?{ $_.Uri.OriginalString -eq "/ppt/slides/slide$i.xml" }
        if (-not $slide) { break }
        $sr = New-Object System.IO.StreamReader $slide.GetStream(),$enc
        $tmp = $sr.ReadToEnd()
        $sr.Close()
        # Append the speaker notes of the slide, if any.
        $sliderels = $parts | ?{ $_.Uri.OriginalString -eq "/ppt/slides/_rels/slide$i.xml.rels" }
        if ($sliderels) {
            $sr = New-Object System.IO.StreamReader $sliderels.GetStream(),$enc
            $rel = [xml]$sr.ReadToEnd() | %{ $_.Relationships.Relationship } | ?{ $_.Type -Match "notesSlide" }
            $sr.Close()
            if ($rel) {
                $nfile = $([io.path]::GetFileName($rel.Target))
                $note = $parts | ?{ $_.Uri.OriginalString -eq "/ppt/notesSlides/$nfile" }
                $sr = New-Object System.IO.StreamReader $note.GetStream(),$enc
                $tmp += " " + $sr.ReadToEnd()
                $sr.Close()
            }
        }
        $tmp = $tmp -Replace "\r?\n",""
        $tmp = $tmp -Replace "\s+"," "
        [void]$text.Append($tmp + "`n")
    }
    $package.Close()
    $t = $text.ToString()
    [void]$text.Clear()
    $t = $t -Replace "<[^>]+>",""
    # Decode the XML character entities left after the tags are stripped.
    $t = $t -Replace "&amp;","&"
    $t = $t -Replace "&lt;","<"
    $t = $t -Replace "&gt;",">"
    return $t
}

Function ReadXlsxFile($XlsxFile) {
    Add-Type -AssemblyName WindowsBase
    $enc = [System.Text.Encoding]::UTF8
    $file = (Get-Childitem -Path $XlsxFile).FullName
    $package = [System.IO.Packaging.Package]::Open($file, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read)
    $parts = $package.GetParts() | %{ $_ }
    $text = New-Object System.Text.StringBuilder
    $script:Cell = New-Object 'System.Collections.Generic.List[System.String]' 100000
    # Load the shared string table that "s"-type cells refer to by index.
    $sst = $parts | ?{ $_.Uri.OriginalString -eq "/xl/sharedStrings.xml" }
    if ($sst) {
        $strings = New-Object 'System.Collections.Generic.List[System.String]' 5000
        $sr = New-Object System.IO.StreamReader $sst.GetStream(),$enc
        $si = [xml]$sr.ReadToEnd() | %{ $_.sst.si }
        foreach ($s in $si) {
            if ($s.t -and $s.t.GetType().Name -eq "String") {
                $tmp = $s.t
            } elseif ($s.t."#text") {
                $tmp = $s.t."#text"
            } elseif ($s.r) {
                $tmp = ""
                foreach ($r in $s.r) {
                    if ($r.t -and $r.t.GetType().Name -eq "String") {
                        $tmp += $r.t
                    } elseif ($r.t."#text") {
                        $tmp += $r.t."#text"
                    }
                }
            }
            $tmp = $tmp -Replace "\r?\n",""
            $tmp = $tmp -Replace "\s+"," "
            [void]$strings.Add($tmp)
        }
        $sr.Close()
    }
    for ([Int]$i = 1; $i -le 999; $i++) {
        $sheet = $parts | ?{ $_.Uri.OriginalString -eq "/xl/worksheets/sheet$i.xml" }
        if (-not $sheet) { break }
        $sr = New-Object System.IO.StreamReader $sheet.GetStream(),$enc
        $rows = [xml]$sr.ReadToEnd() | %{ $_.worksheet.sheetdata.row }
        foreach ($row in $rows) {
            foreach ($col in $row.c) {
                switch ($col.t) {
                    "s" {
                        [void]$text.Append($strings[$col.v] + "`n")
                        [void]$script:Cell.Add("S$($i.ToString())$($col.r)")
                    }
                    "inlineStr" {
                        if ($col.is.t -and $col.is.t.GetType().Name -eq "String") {
                            $tmp = $col.is.t
                        } elseif ($col.is.t."#text") {
                            $tmp = $col.is.t."#text"
                        } elseif ($col.is.r) {
                            $tmp = ""
                            foreach ($r in $col.is.r) {
                                if ($r.t -and $r.t.GetType().Name -eq "String") {
                                    $tmp += $r.t
                                } elseif ($r.t."#text") {
                                    $tmp += $r.t."#text"
                                }
                            }
                        }
                        $tmp = $tmp -Replace "\r?\n",""
                        $tmp = $tmp -Replace "\s+"," "
                        [void]$text.Append($tmp + "`n")
                        [void]$script:Cell.Add("S$($i.ToString())$($col.r)")
                    }
                }
            }
        }
        $sr.Close()
    }
    if ($strings) { $strings.Clear() }
    # Append the cell comments of each sheet.
    $NumSheets = ($parts | %{ $_.Uri.OriginalString -Match "/xl/worksheets/sheet[0-9]+.xml" }).Length
    for ([Int]$i = 1; $i -le $NumSheets; $i++) {
        $sheetrels = $parts | ?{ $_.Uri.OriginalString -eq "/xl/worksheets/_rels/sheet$i.xml.rels" }
        if (-not $sheetrels) { continue }
        $sr = New-Object System.IO.StreamReader $sheetrels.GetStream(),$enc
        $rel = [xml]$sr.ReadToEnd() | %{ $_.Relationships.Relationship } | ?{ $_.Type -Match "comments" }
        $sr.Close()
        if (-not $rel) { continue }
        $cfile = $([io.path]::GetFileName($rel.Target))
        $comment = $parts | ?{ $_.Uri.OriginalString -eq "/xl/$cfile" }
        $sr = New-Object System.IO.StreamReader $comment.GetStream(),$enc
        $cc = [xml]$sr.ReadToEnd() | %{ $_.comments.commentList.comment }
        foreach ($c in $cc) {
            $tmp = ""
            foreach ($r in $c.text.r) {
                if ($r.t -and $r.t.GetType().Name -eq "String") {
                    $tmp += $r.t
                } elseif ($r.t."#text") {
                    $tmp += $r.t."#text"
                }
            }
            $tmp = $tmp -Replace "\r?\n",""
            $tmp = $tmp -Replace "\s+"," "
            [void]$text.Append($tmp + "`n")
            [void]$script:Cell.Add("S$($i.ToString())$($c.ref)C")
        }
        $sr.Close()
    }
    $package.Close()
    $t = $text.ToString()
    [void]$text.Clear()
    return $t
}

Function ReadPdfFile($PdfFile) {
    if (-not (Test-Path $script:PSScriptRoot\itextsharp.dll)) {
        Write-Host "Unable to search in $PdfFile because itextsharp.dll doesn't exist." -Fore Red
        return
    }
    Add-Type -Path $script:PSScriptRoot\itextsharp.dll
    $reader = New-Object iTextSharp.text.pdf.pdfreader -ArgumentList $PdfFile
    $text = New-Object System.Text.StringBuilder
    for ([Int]$i = 1; $i -le $reader.NumberOfPages; $i++) {
        $strategy = New-Object iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy
        [void]$text.Append([iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $i, $strategy))
        remove-variable strategy
    }
    $reader.Close()
    $t = $text.ToString()
    [void]$text.Clear()
    return $t
}

Function RGBColor([ConsoleColor]$Color) {
    switch -Exact ($Color) {
        "Black"       { return "#000000" }
        "DarkBlue"    { return "#000080" }
        "DarkGreen"   { return "#008000" }
        "DarkCyan"    { return "#008080" }
        "DarkRed"     { return "#800000" }
        "DarkMagenta" { return "#012456" }
        "DarkYellow"  { return "#EEEDF0" }
        "Gray"        { return "#C0C0C0" }
        "DarkGray"    { return "#808080" }
        "Blue"        { return "#0000FF" }
        "Green"       { return "#008000" }
        "Cyan"        { return "#00FFFF" }
        "Red"         { return "#FF0000" }
        "Magenta"     { return "#FF00FF" }
        "Yellow"      { return "#FFFF00" }
        "White"       { return "#FFFFFF" }
        Default       { return "" }
    }
}

Function HtmlHeader() {
    $script:Html = New-Object System.Text.StringBuilder
    [void]$script:Html.Append("`n")
    [void]$script:Html.Append("`n")    # <- You may want to change this
    [void]$script:Html.Append("`n")
    [void]$script:Html.Append("`n")
    [void]$script:Html.Append("RegExS Output`n")
    [void]$script:Html.Append("`n")
    [void]$script:Html.Append("`n`n
`n")
}

Function HtmlFooter() {
    [void]$script:Html.Append("
`n`n")
}

Function Htmlify($t) {
    # Escape the characters that are special in HTML; "&" must be handled first.
    $t = $t -Replace "&", "&amp;"
    $t = $t -Replace ">", "&gt;"
    $t = $t -Replace "<", "&lt;"
    $t = $t -Replace """", "&quot;"
    return $t
}

Function Pause_and_Exit() {
    Write-Host "Press any key to exit . . ." -Fore Green
    $Host.UI.RawUI.FlushInputBuffer()
    $Host.UI.RawUI.ReadKey("NoEcho,IncludeKeyUp") | Out-Null
    exit
}

Function Main() {
    if ($DoubleClicked) { $Interactive = $True }
    if ($Interactive) {
        if ($Pattern) {
            Write-Host "Enter a search text. [$Pattern]:" -NoNewline -Fore Green
            $Answer = ""
            try { $Answer = Read-Host } catch [System.Exception] {}
            if ($Answer) { $Pattern = $Answer }
        } else {
            Write-Host "Enter a search text.:" -NoNewline -Fore Green
            try { $Pattern = Read-Host } catch [System.Exception] {}
            if (-not $Pattern) { Pause_and_Exit }
        }
        Write-Host "Enter a capture-group name or number. [$Group]:" -NoNewline -Fore Green
        $Answer = ""
        try { $Answer = (Read-Host).Trim() } catch [System.Exception] {}
        if ($Answer) { $Group = $Answer }
        Write-Host "Select a target folder." -Fore Green
        Add-Type -AssemblyName System.Windows.Forms
        $fbd = New-Object System.Windows.Forms.FolderBrowserDialog
        $fbd.ShowNewFolderButton = $false
        $fbd.Description = "Select a target folder."
        if ($Dir -and (Test-Path -LiteralPath $Dir)) {
            $fbd.SelectedPath = Convert-Path -LiteralPath $Dir
        } else {
            $fbd.SelectedPath = [string]$PWD
        }
        $Result = $fbd.ShowDialog((New-Object System.Windows.Forms.Form -Property @{TopMost = $true}))
        if ($Result -eq [System.Windows.Forms.DialogResult]::OK) {
            $Dir = $fbd.SelectedPath
        } else {
            Write-Host "invalid input" -Fore Red; Pause_and_Exit
        }
        $fbd.Dispose()
        #Begin Get-WindowFocus
        Add-Type @"
            using System;
            using System.Runtime.InteropServices;
            public class SFW {
                [DllImport("user32.dll")]
                [return: MarshalAs(UnmanagedType.Bool)]
                public static extern bool SetForegroundWindow(IntPtr hWnd);
            }
"@
        # Walk up to two levels of parent processes to find the console window and bring it to the front.
        $PPID=$PID
        For ([Int]$i=0; $i -lt 2; $i++) {
            $PPID=(Get-WmiObject Win32_Process -Filter "ProcessID=$PPID").ParentProcessID
            try {
                $WindowTitle  = (Get-Process -ID $PPID -ErrorAction SilentlyContinue).MainWindowTitle
                $WindowHandle = (Get-Process -ID $PPID -ErrorAction SilentlyContinue).MainWindowHandle
                if ($WindowTitle) {
                    [void][SFW]::SetForegroundWindow($WindowHandle)
                    break
                }
            } catch [System.Exception] {
                break
            }
        }
        #End Get-WindowFocus
        Write-Host "Include sub-folders? Y/N [$(if($Recurse){"Y"}else{"N"})]:" -NoNewline -Fore Green
        try { $Answer = (Read-Host).Trim().ToUpper() } catch [System.Exception] {}
        switch ($Answer) {
            "Y"     { $Recurse = $True }
            "N"     { $Recurse = $False }
            ""      { }
            default { Write-Host "invalid input" -Fore Red; Pause_and_Exit }
        }
        Write-Host "Enter file names. (wildcard allowed) [$Include]:" -NoNewline -Fore Green
        $Answer = ""
        try { $Answer = (Read-Host).Trim() } catch [System.Exception] {}
        if ($Answer) { $Include = $Answer }
        Write-Host "Specify the other options? Y/N [N]:" -NoNewline -Fore Green
        $Answer = ""
        try { $Answer = (Read-Host).Trim().ToUpper() } catch [System.Exception] {}
        if ($Answer -eq "Y") {
            Write-Host "Output the search result to an HTML file? Y/N [$(if($HtmlOutput){"Y"}else{"N"})]:" -NoNewline -Fore Green
            try { $Answer = (Read-Host).Trim().ToUpper() } catch [System.Exception] {}
            switch ($Answer) {
                "Y"     { $HtmlOutput = $True }
                "N"     { $HtmlOutput = $False }
                ""      { }
                default { Write-Host "invalid input" -Fore Red; Pause_and_Exit }
            }
            Write-Host "Disable regular expression search? Y/N [$(if($SimpleMatch){"Y"}else{"N"})]:" -NoNewline -Fore Green
            try { $Answer = (Read-Host).Trim().ToUpper() } catch [System.Exception] {}
            switch ($Answer) {
                "Y"     { $SimpleMatch = $True }
                "N"     { $SimpleMatch = $False }
                ""      { }
                default { Write-Host "invalid input" -Fore Red; Pause_and_Exit }
            }
            Write-Host "Ignore upper/lower cases? Y/N [$(if($IgnoreCase){"Y"}else{"N"})]:" -NoNewline -Fore Green
            try { $Answer = (Read-Host).Trim().ToUpper() } catch [System.Exception] {}
            switch ($Answer) {
                "Y"     { $IgnoreCase = $True }
                "N"     { $IgnoreCase = $False }
                ""      { }
                default { Write-Host "invalid input" -Fore Red; Pause_and_Exit }
            }
            Write-Host "Convert wide characters into narrow ones before search? Y/N [$(if($Narrow){"Y"}else{"N"})]:" -NoNewline -Fore Green
            try { $Answer = (Read-Host).Trim().ToUpper() } catch [System.Exception] {}
            switch ($Answer) {
                "Y"     { $Narrow = $True }
                "N"     { $Narrow = $False }
                ""      { }
                default { Write-Host "invalid input" -Fore Red; Pause_and_Exit }
            }
            Write-Host "Enter file names to be excluded. (wildcard allowed) [$Exclude]:" -NoNewline -Fore Green
            $Answer = ""
            try { $Answer = (Read-Host).Trim() } catch [System.Exception] {}
            if ($Answer) { $Exclude = $Answer }
            Write-Host "Enter a character encoding for the files without BOMs. [$Encoding]:" -NoNewline -Fore Green
            $Answer = ""
            try { $Answer = (Read-Host).Trim() } catch [System.Exception] {}
            if ($Answer) {
                $Encoding = $Answer
                # Must list the same names as the ValidateSet of RegExS's -Encoding parameter.
                $EncodingList = @("unknown", "string", "unicode", "byte", "bigendianunicode", `
                                  "utf8", "utf7", "utf32", "ascii", "default", "oem", "bigendianutf32")
                if (-not ($EncodingList -Contains $Encoding.ToLower())) { Write-Host "invalid input" -Fore Red; Pause_and_Exit }
            }
        } elseif ($Answer -ne "N" -and $Answer -ne "") {
            Write-Host "invalid input" -Fore Red; Pause_and_Exit
        }
    }
    RegExS $Pattern $Dir $Include -e $Encoding -g $Group -h:$HtmlOutput -i:$IgnoreCase -n:$Narrow -r:$Recurse `
           -s:$SimpleMatch -x $Exclude `
           -bc $BackgroundColor -cc $CapturegroupColor -fc $ForegroundColor
    if ($Interactive) { Pause_and_Exit }
}

Main