Get Programming Language Keywords By Using Regular Expression in Java

We can use a regular expression to find all Java keywords in a program. The key is using the word boundary (\b) correctly. For example, given “static staticField”, the first word should be recognized as a keyword, but the second should not, because it only starts with one.
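As a minimal sketch of that boundary behavior, the snippet below searches for the single keyword static using only java.util.regex. The class name BoundaryDemo and the helper findStatic are illustrative and not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Returns every whole-word occurrence of "static" as "match@startIndex".
    public static List<String> findStatic(String input) {
        // \b matches only between a word character and a non-word character
        // (or at the start/end of the input), so "staticField" is rejected.
        Matcher m = Pattern.compile("\\bstatic\\b").matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group() + "@" + m.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findStatic("static staticField")); // prints [static@0]
    }
}
```

Without the two \b anchors, the same pattern would also match the first six characters of staticField.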

Here is the code:

package regex;
 
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang.StringUtils;
 
public class RegTest {
	public static void main(String[] args) {
		String keyString = "abstract assert boolean break byte case catch "
				+ "char class const continue default do double else enum"
				+ " extends false final finally float for goto if implements "
				+ "import instanceof int interface long native new null " 
				+ "package private protected public return short static "
				+ "strictfp super switch synchronized this throw throws true " 
				+ "transient try void volatile while";
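		// Turn the space-separated keyword list into a regex alternation: abstract|assert|...
		String[] keys = keyString.split(" ");
		String keyStr = StringUtils.join(keys, "|");
 
		// \b on both sides ensures that only whole words match
		String regex = "\\b("+keyStr+")\\b";
		String target = "static public staticpublic void main()";
		Pattern p = Pattern.compile(regex);
		Matcher m = p.matcher(target);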
 
		while(m.find()){
			System.out.println("|"+m.group()+"|");
			System.out.println(m.start());
			System.out.println(m.end());
		}
	}
}

Output:

|static|
0
6
|public|
7
13
|void|
27
31

Check out Boundary Matcher for more examples.
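One limitation of the plain word-boundary pattern is that it also matches keyword-like words inside string literals, such as the null in "there is a null word in here". A possible workaround, sketched below under some assumptions, is to let the pattern consume double-quoted literals first and report a match only when the keyword group was the one that matched. The class name KeywordOutsideStrings, the helper keywords, and the shortened keyword list are all illustrative. The sketch does not handle comments, character literals, or text blocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordOutsideStrings {
    // Shortened keyword list for illustration; the full list works the same way.
    private static final String KEYWORDS = "public|static|void|return|null";

    // First alternative: a whole double-quoted literal, including escaped characters.
    // Second alternative: a whole-word keyword, captured as group 1.
    private static final Pattern PATTERN = Pattern.compile(
            "\"(?:\\\\.|[^\"\\\\])*\"|\\b(" + KEYWORDS + ")\\b");

    // Returns the keywords found outside of string literals, in order of appearance.
    public static List<String> keywords(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = PATTERN.matcher(source);
        while (m.find()) {
            // group(1) is null when the match was a string literal, so it is skipped
            if (m.group(1) != null) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String src = "public static void test(String txt) "
                + "{ return txt + \"there is a null word in here\"; }";
        System.out.println(keywords(src)); // prints [public, static, void, return]
    }
}
```

For anything beyond simple snippets, a real tokenizer or parser is likely a better fit than a single regular expression.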

1 thought on “Get Programming Language Keywords By Using Regular Expression in Java”

  1. This example is terrible…

    1) ‘exp’ is not a variable that you have defined anywhere, so it can’t even compile.

    2) Why are you including a full library for something as simple as this? This leads into #3…

    3) Your splitting with regex is OVERKILL. Why are you calling what are likely expensive functions compared to:
    String keyStr = keyString.replace(' ', '|');
    Hell, why not just put the pipes in the text itself?

    4) Your “static public staticpublic void main()” directly shows your code is broken!
    It returns:
    |static|
    0
    6
    |public|
    7
    13
    |static|
    14
    20
    |public|
    20
    26
    |void|
    27
    31

    5) Furthermore, what if you were parsing this string?
    "public static void test(String txt) { return txt + \"there is a null word in here\"; }";
    Guess what? You just found a keyword that is not a keyword inside a String:
    |public|
    0
    6
    |static|
    7
    13
    |void|
    14
    18
    |return|
    38
    44
    |null| <– UH OH!!!
    63
    67

    Can you at least compile your examples BEFORE you upload them online?

    You actually typed more code, included a full blown library that you didn't need to, AND wrote code that likely takes significantly longer to run for no reason.

    Anyways, I'm trying not to be too negative but damn… looking at all your other examples on this site has me convinced this is a troll site designed to ruin upcoming Java developers with garbage code and incorrect methods.
