:Class regexMatch
⍝ This class generates instances based on the .Net Regex class

    :include listsUtils     ⍝∇:require =\listsUtils
    :include partScan       ⍝∇:require =\partScan
    :include textUtils      ⍝∇:require =\textUtils

    ⎕io←0 ⋄ ⎕ml←1 ⋄ ⎕wx←3 ⋄ (LF CR)←2↓4↑⎕av

    ∇ genpat pattern
      :Implements constructor
      :Access public
      cPattern←'m'compileRegex pattern
      ⎕DF pattern
    ∇

    ∇ genpatopt(pattern options);arg
      :Implements constructor
      :Access public
      cPattern←options compileRegex pattern
      ⎕DF pattern
    ∇

    ∇ letters←extraNameFormingChars;r
    ⍝ Returns the extra characters used to form names in APL
    ⍝ The following varies from one APL to another but ⎕NC=0 always
    ⍝ means available and in Dyalog an invalid name has a ⎕NC of ¯1
      r←0≤⎕NC 256 1⍴⎕AV ⍝ check validity
      letters←(r/⎕AV)~,⎕AV[(⎕AV⍳'Aa')∘.+⍳26] ⍝ assume alphabets together
    ∇

    ∇ rx←{options}compileRegex pattern;⎕USING;rxopt;opt
    ⍝ This function returns a .Net regex object from a pattern
    ⍝ 'options' are
    ⍝ M: Multiline search, S: Singleline search, I: case Insensitive
      ⎕USING←'System.Text.RegularExpressions,system.dll'
    ⍝ Determine the options
      :If 0=⎕NC'options' ⋄ options←'' ⋄ :EndIf
      opt←opt∨(⍴opt)↑~∨/opt←∨⌿2 3⍴'msiMSI'∊options ⍝ Multiline default
      rxopt←+/opt/RegexOptions.(Multiline Singleline IgnoreCase)
      rx←⎕NEW Regex,⊂(APLPattern pattern)rxopt
    ∇

⍝ ------- Instance methods -------

    ∇ pattern←APLPattern pattern;⎕USING;⎕IO;e;sl;nobsl;w;nfc;cut
    ⍝ This function is used to return a pattern a la APL
      ⎕IO←0 ⋄ cut←sl>¯1↓0,sl←'\'=pattern
      nobsl←¯1↓1,sl⍲cut pUes sl ⍝ where there is no unescaped \ before
      e←(⍴,pattern)↑'∧'=1↑pattern ⍝ BOL anchor
    ⍝ There is a known problem with some ASCII characters like ∧:
    ⍝ ∧ means "start of string" at the beginning or "NOT" inside brackets in which case
    ⍝ it must be converted to its ASCII counterpart
      ((e∨¯1⌽nobsl∧'[∧'⍷pattern)/pattern)←⎕AV[235] ⍝ map ∧ to ASCII caret if no translation done
    ⍝ This version accepts sequences ⍺ & ⍵ as shortcuts for "token" and "number"
    ⍝ Like any metacharacter they can be disabled using '\'
      w←nobsl∧'⍵'=pattern
    ⍝ find numbers (⍵) before ⍺ tokens
      nfc←'a-zA-Z',extraNameFormingChars
      :If 1∊e←nobsl∧'⍺'=pattern
    ⍝ in Dyalog a name must start with a letter followed by 0 or more alphanums
    ⍝ or it can be ⍺, ⍺⍺, ⍵ or ⍵⍵. It must not be preceded by : either (e.g. :IF)
          (e/pattern)←⊂'(?>(?(?\d+\.?\d*|\d*\.?\b\d+)(?>[eE]¯?\d+)?)' ⍝ atomic grouping
          pattern←0⊃,,/pattern,⊂'' ⍝ cover empty case
          pattern←(~'\⍺'⍷e)/e←(~'\⍵'⍷e)/e←pattern ⍝ remove unnecessary \s
    ∇

    ∇ r←{options}findMatch string;e;mo;np;msg
    ⍝ This function is used to find where a pattern is found in a string.
    ⍝ It returns an int list indicating the start & length of each match.
    ⍝ Each subexpression specified is shown for each match (3D result).
      :Access public
      {}⎕FX,⊂'z←options z'
      :Trap 90
    ⍝ Change CR into NL (needed for search)
          mo←cPattern.Match⊂string charReplace CR LF
          r←0⍴⊂0 2⍴0
    ⍝ Execute the expression until nothing found
          :While mo.Success
              np←mo.Groups.Count
              r←r,⊂↑mo.Groups[⍳np].(Index Length)
              mo←mo.NextMatch
          :EndWhile
          r←↑r ⍝ disclose results
      :Else
          msg←{256>⍴⍵:⍵ ⋄ '...',¯252↑⍵}(e⍳CR)↑e←⍕⎕EXCEPTION
          msg ⎕SIGNAL 11
      :EndTrap
    ∇

    ∇ text←showMatches string
      :Access public
      text←displayMatch cPattern string
    ∇

    ∇ r←{options}displayMatch(pattern string);⎕USING;⎕IO;⎕ML;⎕WX;e;mo;msg;cpattern;⎕TRAP;lno;eachline;offset;marks;lines;startpos;ind;len;lel;move;dec;hits
    ⍝ This function is used to show where a pattern is found in a string.
    ⍝ It displays each line with carets under where the match is made
    ⍝ This fn is ⎕io independent.
    ⍝ Pattern may be compiled
      ⎕ML←3 ⋄ ⎕IO←0 ⋄ ⎕WX←3
      :If 0=⎕NC'options' ⋄ options←'l' ⋄ :EndIf
    ⍝ Change CR into NL (needed for search)
      string←intoCR string
      startpos←0,1+e/⍳⍴e←string=CR ⍝ where each line starts
      lel←1+∊⍴¨eachline←string splitOn CR ⍝ eachline and its length
    ⍝ Check that the pattern is compiled
      ⎕TRAP←90 'C' '→err90'
      :If 2 0∨.=10|⎕DR,cpattern←pattern
          cpattern←options compileRegex pattern
      :EndIf
      mo←cpattern.Match⊂string charReplace CR LF
      lines←marks←r←⍴hits←0
    ⍝ Execute the expression until nothing found
      :While mo.Success
          (ind len)←mo.Groups[0].(Index Length)
          :If len>0 ⍝ no point tracking 0 length matches
              lines←lines,lno←¯1+⊃fromTo/+⌿startpos∘.≤ind+0,len
              offset←ind-startpos[lno[0]] ⍝ # spaces before
              marks←marks,(offset len/0 1)splitAt+\lel[¯1↓lno]
          :EndIf
          mo←mo.NextMatch
          hits+←1
      :EndWhile
    ⍝ All lines # and their marks have been gathered
      marks←\∘'∧'¨∨⌿∘⊃¨(1+lines)⊂marks ⍝ 'lines' must be >0
      lno←∪lines
      eachline←eachline[lno]
      move←''
      :If ∨/'lL'∊options
          dec←'[',⊃,∘'] '¨⍕¨lno ⍝ add line #s?
          move←(¯1↑⍴dec)↑''
          eachline←eachline,¨⍨↓dec
      :EndIf
      r←r,∊CR,¨eachline,[0.2]move∘,¨marks
    ⍝ Add number found
      r←((⍕hits),' match',(2×hits=1)↓'es found'),r
      →0
     err90:
      msg←{256>⍴⍵:⍵ ⋄ '...',¯252↑⍵}(e⍳CR)↑e←⍕⎕EXCEPTION
      msg ⎕SIGNAL 11
    ∇

    ∇ r←Replace(text by)
    ⍝ Same as the Regex Replace function
      :Access public
      r←cPattern.Replace text by
    ∇

:EndClass