Finding and Highlighting All Instances of a Pattern

 Finding and Highlighting All Instances of a Pattern


PROBLEM

You want to find all instances of a pattern within a string. 


Solution

Use the RegExp exec method and the global flag (g) in a loop to locate all instances of a pattern, such as any word that begins with t and ends with e, with any number of characters in between:

var searchString = "Now is the time and this is the time and that is the time"; 
var pattern = /t\w*e/g;
var matchArray; 


var str = ""; 


// check for pattern with regexp exec, if not null, process 
while((matchArray = pattern.exec(searchString)) != null) {
str+="at " + matchArray.index + " we found " + matchArray[0] + "\n"; 

console.log(str);
The results are:

at 7 we found the 
at 11 we found time 
at 28 we found the 
at 32 we found time 
at 49 we found the 
at 53 we found time 

EXPLAIN

The RegExp exec() method executes the regular expression, returning null if a match is not found, or an object with information about the match, if found. Included in the returned array is the actual matched value, the index in the string where the match is found, any parenthetical substring matches, and the original string:

• index: The index of the located match 
• input: The original input string 
• [0]: The matched value 
• [1],…,[n]+: Parenthesized substring matches, if any 

The parentheses capture the matched values. Given a regular expression like that in the following code snippet: 

var re = /a(p+).*(pie)/ig; 
var result = re.exec("The apples in the apple pie are tart"); 
console.log(result); 
console.log(result.index); 
console.log(result.input);


the resulting output is:

["apples in the apple pie", "pp", "pie"]
4 
"The apples in the apple pie are tart"
 
The array results contain the complete matched value at index zero (0), and the rest of the array entries are the parenthetical matches. The index is the index of the match, and the input is just a repeat of the string being matched. In the solution, the index where the match was found is printed out in addition to the matched value. 

The solution also uses the global flag (g). This triggers the RegExp object to preserve the location of each match, and to begin the search after the previously discovered match. When used in a loop, we can find all instances where the pattern matches the string. In the solution, the following are printed out:

at 7 we found the 
at 11 we found time 
at 28 we found the 
at 32 we found time
at 49 we found the 
at 53 we found time

Let’s look at the nature of global searching in action. In Example 1-1, a web page is created with a textarea and an input text box for accessing both a search string and a pattern. The pattern is used to create a RegExp object, which is then applied against the string. A result string is built, consisting of both the unmatched text and the matched text, except the matched text is surrounded by a span element (with a CSS class used to highlight the text). The resulting string is then inserted into the page, using the innerHTML for a div element.


Using exec and global flag to search and highlight all matches in a text
string



<!DOCTYPE html>

<html>

<head>

<title>Searching for strings</title>

<style>

.found

{

 background-color: #ff0;

}

</style>

</head>

<body>

 <form id="textsearch">

 <textarea id="incoming" cols="150" rows="10">

 </textarea>



 <p>









Search pattern:


 <input id="pattern" type="text" />

 </p>

 </form>

 <button id="searchSubmit">Search for pattern</button>

 <div id="searchResult"></div>

<script>

 document.getElementById("searchSubmit").onclick=function() {

 // get pattern

 var pattern = document.getElementById("pattern").value;

 var re = new RegExp(pattern,"g");

 // get string

 var searchString = document.getElementById("incoming").value;

 var matchArray;

 var resultString = "<pre>";

 var first=0;

 var last=0;

 // find each match

 while((matchArray = re.exec(searchString)) != null) {

 last = matchArray.index;

 // get all of string up to match, concatenate

 resultString += searchString.substring(first, last);

 // add matched, with class

 resultString += "<span class='found'>" + matchArray[0] + "</span>";

 first = re.lastIndex;

 }

 // finish off string

 resultString += searchString.substring(first,searchString.length);

 resultString += "</pre>";

 // insert into page

 document.getElementById("searchResult").innerHTML = resultString;

}

</script>

</body>



</html>

The bar (|) is a conditional test, and will match a word based on the value on either side of the bar. So leaf matches, as well as leaves, but not leap.

You can access the last index found through the RegExp’s lastIndex property. The lastIndex property is handy if you want to track both the first and last matches.




0 comments:

Post a Comment