Leetcode – Word Break (Java)

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

For example, given
s = "leetcode",
dict = ["leet", "code"].

Return true because "leetcode" can be segmented as "leet code".

1. Naive Approach

This problem can be solved with a naive recursive approach. It is trivial, but it is a useful starting point for discussion.

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
        return wordBreakHelper(s, dict, 0);
    }
 
    public boolean wordBreakHelper(String s, Set<String> dict, int start){
        if(start == s.length()) 
            return true;
 
        for(String a: dict){
            int len = a.length();
            int end = start+len;
 
            //end index should be <= string length
            if(end > s.length()) 
                continue;
 
            if(s.substring(start, start+len).equals(a))
                if(wordBreakHelper(s, dict, start+len))
                    return true;
        }
 
        return false;
    }
}

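In the worst case the time is exponential in the length of s, because the same suffix can be re-examined many times (for example, s = "aaaaab" with dict = ["a", "aa"]), so this solution exceeds the time limit.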
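
The recursion above can revisit the same start index many times. One common remedy, also hinted at in the comments, is to cache the answer per start index. The following is a minimal sketch of that idea, not the post's own solution: the class name `MemoWordBreak` and the `memo` array are illustrative, and `String.startsWith(prefix, offset)` is swapped in for the substring/equals pair.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MemoWordBreak {
    static boolean wordBreak(String s, Set<String> dict) {
        // memo[i]: null = not computed yet, otherwise whether s.substring(i) can be segmented
        return helper(s, dict, 0, new Boolean[s.length() + 1]);
    }

    private static boolean helper(String s, Set<String> dict, int start, Boolean[] memo) {
        if (start == s.length()) return true;
        if (memo[start] != null) return memo[start];

        boolean result = false;
        for (String word : dict) {
            if (word.isEmpty()) continue; // an empty word would recurse forever
            // startsWith(word, start) compares in place and is false if word runs past the end
            if (s.startsWith(word, start) && helper(s, dict, start + word.length(), memo)) {
                result = true;
                break;
            }
        }
        memo[start] = result; // each start index is now solved at most once
        return result;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("leet", "code"));
        System.out.println(wordBreak("leetcode", dict));  // true
        System.out.println(wordBreak("leetcodes", dict)); // false
    }
}
```

With the cache, each start index is solved once, so the work is bounded by (string length * dict size) word comparisons.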

2. Dynamic Programming

The key to solving this problem with dynamic programming:

  • Define an array t[] such that t[i] == true means the prefix s[0..i-1] (the first i characters) can be segmented using the dictionary
  • Initial state: t[0] == true (the empty prefix)

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length()+1];
        t[0] = true; //set first to be true, why?
        //Because the empty prefix is the initial state: it is segmented with zero words
 
        for(int i=0; i<s.length(); i++){
            //should continue from match position
            if(!t[i]) 
                continue;
 
            for(String a: dict){
                int len = a.length();
                int end = i + len;
                if(end > s.length())
                    continue;
 
                if(t[end]) continue;
 
                if(s.substring(i, end).equals(a)){
                    t[end] = true;
                }
            }
        }
 
        return t[s.length()];
    }
}

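Time: O(string length * dict size) word comparisons. Each comparison (substring plus equals) can cost up to the length of the word, so the total is closer to O(string length * dict size * longest word length).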
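
To make the meaning of t[] concrete, here is a small sketch that runs the same DP and returns the whole table for inspection. The class name `WordBreakTable` and the method `table` are made up for this illustration, and `startsWith(word, i)` is used in place of the substring/equals pair to avoid allocating a substring per comparison.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class WordBreakTable {
    // Same DP idea as above, but returns the table instead of only its last entry.
    // t[i] == true means the first i characters of s can be segmented.
    static boolean[] table(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length() + 1];
        t[0] = true; // empty prefix

        for (int i = 0; i < s.length(); i++) {
            if (!t[i]) continue; // only extend from positions that are reachable
            for (String word : dict) {
                // false when the word does not match at i or would run past the end of s
                if (s.startsWith(word, i)) {
                    t[i + word.length()] = true;
                }
            }
        }
        return t;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("leet", "code"));
        System.out.println(Arrays.toString(table("leetcode", dict)));
        // [true, false, false, false, true, false, false, false, true]
    }
}
```

Only indices 0, 4 and 8 are true: the empty prefix, "leet", and "leetcode".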

3. Java Solution 3 - Simple and Efficient

In Solution 2, the running time grows with the size of the dictionary, which is bad when the dictionary is very large. Instead, we can test every substring against the set and solve the problem with O(n^2) set lookups, where n is the length of the string.

public boolean wordBreak(String s, Set<String> wordDict) {
    int[] pos = new int[s.length()+1];
 
    Arrays.fill(pos, -1);
 
    pos[0]=0;
 
    for(int i=0; i<s.length(); i++){
        if(pos[i]!=-1){
            for(int j=i+1; j<=s.length(); j++){
                String sub = s.substring(i, j);
                if(wordDict.contains(sub)){
                    pos[j]=i;
                }
            } 
        }
    }
 
    return pos[s.length()]!=-1;
}
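
One possible refinement, sketched here under the assumption that dictionary words are short compared with s: no match can be longer than the longest dictionary word, so the inner loop can stop early. The names `BoundedWordBreak`, `maxLen` and `reachable` are introduced for this sketch only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class BoundedWordBreak {
    static boolean wordBreak(String s, Set<String> wordDict) {
        // Length of the longest dictionary word; longer substrings can never match.
        int maxLen = 0;
        for (String w : wordDict) {
            maxLen = Math.max(maxLen, w.length());
        }

        boolean[] reachable = new boolean[s.length() + 1];
        reachable[0] = true; // empty prefix

        for (int i = 0; i < s.length(); i++) {
            if (!reachable[i]) continue;
            int limit = Math.min(s.length(), i + maxLen);
            for (int j = i + 1; j <= limit; j++) {
                if (wordDict.contains(s.substring(i, j))) {
                    reachable[j] = true;
                }
            }
        }
        return reachable[s.length()];
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("go", "goal", "goals", "special"));
        System.out.println(wordBreak("goalspecial", dict)); // true
    }
}
```

This reduces the number of set lookups from O(n^2) to O(n * maxLen); when the longest word is as long as s, it behaves like Solution 3.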

4. The More Interesting Problem

The dynamic programming solution can tell us whether the string can be broken into words, but it cannot tell us which words. So how do we get those words?

Check out Word Break II.
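
As a starting point, the pos[] array from Solution 3 already holds enough information to recover one segmentation: pos[j] is a start index i such that s.substring(i, j) is a dictionary word and position i is itself reachable. The sketch below walks those back-pointers from the end of the string. The class name `WordBreakPath` and the method `segment` are hypothetical, and it returns a single segmentation only, not every possible one.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class WordBreakPath {
    // Returns one segmentation joined by spaces, or null if s cannot be segmented.
    static String segment(String s, Set<String> wordDict) {
        int[] pos = new int[s.length() + 1];
        Arrays.fill(pos, -1);
        pos[0] = 0;

        // Same table-filling loop as Solution 3.
        for (int i = 0; i < s.length(); i++) {
            if (pos[i] == -1) continue;
            for (int j = i + 1; j <= s.length(); j++) {
                if (wordDict.contains(s.substring(i, j))) {
                    pos[j] = i;
                }
            }
        }
        if (pos[s.length()] == -1) return null;

        // Walk the back-pointers from the end; pos[j] < j for j > 0, so this terminates.
        Deque<String> words = new ArrayDeque<>();
        for (int j = s.length(); j > 0; j = pos[j]) {
            words.addFirst(s.substring(pos[j], j));
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("leet", "code"));
        System.out.println(segment("leetcode", dict)); // leet code
    }
}
```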

  1. Callus on 2014-1-10

    I don’t think looping through the dic is a good idea.

  2. Kizzle on 2014-2-8

    Do you know if a better one exists? Can you provide a better solution? Appreciate it!

  3. Dynkin on 2014-2-12

    You should skip the string comparison in the last IF condition if t[end] is already true.

  4. ryanlr on 2014-2-13

    I don’t get what you mean, can you explain in more detail? Thanks.

  5. jk451 on 2014-2-26

    Thanks for these solutions. Just starting to go through the problems but looks like very useful website.

    As for how to get the words that the string breaks up to:
    Change the “t” array to integer instead of boolean.

    Replace setting t[end] to true (i.e. saying you have found a break-up of the 0..end substring) with setting t[end] to i, thus saying you have found a break-up of the 0..end substring and the last word in that break-up is substring i..end of the main string.

    Then at the end, if the string can be broken up, check t[s.length()]. The last word in the break-up is the substring starting at t[s.length()] and ending at s.length()-1. Repeat this procedure to get the other words.

  6. ryanlr on 2014-2-26

    Seems good to me, I will try later. Thanks!

  7. jason on 2014-3-13

    Another approach
    package test;

    import java.util.HashSet;
    import java.util.Set;

    public class WordBreak2 {

        public static boolean wordBreak(String s, Set dict) {
            if (s.length() == 0) {
                return true;
            }
            for (int i = 1; i <= s.length(); i++) {
                String firstWord = s.substring(0, i);
                String remaing = s.substring(i);
                if (dict.contains(firstWord) && wordBreak(remaing, dict)) {
                    System.out.print(" ");
                    System.out.print(firstWord);

                    return true;
                }
            }
            return false;
        }

        public static void main(String[] args) {
            Set dict = new HashSet(5);
            dict.add("program");
            if (wordBreak("pprogram", dict)) {
                System.out.println(" YES");
            } else {
                System.out.println(" NO");
            }

            dict = new HashSet(5);
            dict.add("ab");
            dict.add("abc");
            dict.add("de");
            if (wordBreak("abcde", dict)) {
                System.out.println(" YES");
            } else {
                System.out.println(" NO");
            }
        }
    }

    This is more efficient if dict is big, which it usually is.

  8. shreyas KN on 2014-3-26

    We can do it in O(n), right? Assuming Set dic is actually a HashSet, retrieval from a HashSet is always O(1).

    List arr;
    StringBuilder sb = new StringBuilder();
    int i=0;
    int wordIndex=0;
    while(i>s.length){

    if(dic.get(sb.substring(wordIndex,i) != null){
    wordIndex=i+1;
    }else{
    arr.add(sb.substring(wordIndex, i));
    }
    i++;
    }

  9. Pan on 2014-5-4

    recursive solution

    public static boolean wordBreak(String s, Set dict){
        //input validation

        //Base case
        if(dict.contains(s))
            return true;
        else {
            for(int i = 0; i < s.length(); i++){
                String sstr = s.substring(0, i);
                if(dict.contains(sstr))
                    return wordBreak(s.substring(i), dict);
            }
        }
        return false;
    }

  10. SK on 2014-5-22

    Recursion with hashMap:

    class Solution:
        def __init__(self):
            self.table = {}

        def wordBreak(self, s, dict):
            if len(s) == 0:
                return True
            if len(s) == 1:
                return s in dict
            if s in self.table:
                return self.table[s]
            isBreakable = False
            for i in range(len(s)):
                word = s[:i+1]
                if word in dict:
                    subFlag = self.wordBreak(s[i+1:], dict)
                    if s[i+1:] not in self.table:
                        self.table[s[i+1:]] = subFlag
                    isBreakable |= subFlag
            return isBreakable

  11. SK on 2014-5-22

    Using a HashMap can reduce repeated calculation.

  12. codingfacts on 2014-6-9

    complexity of naïve is O(n^2) not O(2^n)

  13. Andrei on 2014-6-22

    Small correction of complexity in the 2nd case.

    It’s mentioned that “Time: O(string length * dict size)” but you also run equals (and substring is not constant for Java > 1.6) for every word in dictionary so it’s more like O(string length * dict size * length of the longest word in dict).

  14. Vivek Venkatesh on 2014-7-8

    I can think of trie based solution:

    Time Complexity = O(n) + O(m)

    Space Complexity = O(m)

    Let me know if this will work:

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class TrieNode {
        char val;
        boolean isRoot;
        boolean isLeaf;
        TrieNode children[]; // There can be at most 26 children (English alphabet)

        TrieNode() {
            val = '^';
            isRoot = true;
            children = new TrieNode[26];
            initializeChildren();
        }

        TrieNode(char val) {
            this.val = val;
            isRoot = false;
            isLeaf = false;
            children = new TrieNode[26];
            initializeChildren();
        }

        void initializeChildren() {
            for(int i=0;i<children.length;i++) {
                children[i] = null;
            }
        }
    }

    class Trie {
        TrieNode root;

        Trie() {
            root = new TrieNode();
        }

        void insert(String input) {
            int start = 'A';
            TrieNode current = root;

            for(int i=0;i<input.length();i++) {
                int val = Character.toUpperCase(input.charAt(i));
                int index = val - start;

                if(current.children[index] == null)
                    current.children[index] = new TrieNode(input.charAt(i));
                current = current.children[index];
            }
            current.isLeaf = true;
        }
    }

    public class Example {
        boolean wordBreak(String s, Set<String> dict) {
            // First Construct Trie from the dictionary
            Trie a = new Trie();
            for(String i : dict) {
                a.insert(i);
            }

            TrieNode current = a.root;
            int start = 'A';
            boolean result = false;

            for(int i=0; i<s.length();i++) {
                int val = Character.toUpperCase(s.charAt(i));
                int index = val - start;

                if(current.children[index] == null) {
                    // Word is not in the dictionary
                    current = a.root;
                    result = false;
                    break;
                }

                current = current.children[index];
                if(current.isLeaf == true) {
                    // Start from the beginning for the next character
                    current = a.root;
                    result = true;
                }
            }
            return result;
        }

        public static void main(String[] args) {
            Example temp = new Example();
            Set<String> dict = new HashSet<String>();
            dict.add("leet");
            dict.add("code");
            dict.add("programcree");
            dict.add("program");
            dict.add("creek");

            System.out.println("Wordbreak (programcreek) = " + temp.wordBreak("programcreek", dict));
            System.out.println("Wordbreak (leetcode) = " + temp.wordBreak("leetcode", dict));
            System.out.println("Wordbreak (lesscode) = " + temp.wordBreak("lesscode", dict));
        }
    }

  17. ravio on 2014-7-10

    Apparently, this algorithm is not correct.

    If you use this dictionary

    dict.add("leet");
    dict.add("code");
    dict.add("lee");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");

    then it will not recognize “leetcode”.

  18. AlgorithmFreak on 2014-8-8

    Not true from Java 7 onward. Since Java 7, substring() is an O(n) operation!

  19. german on 2014-9-24

    One of the questions will be: can we use the same dictionary word more than once? For example, if we have the word “leetcodeleetcode” and the dictionary has the words {“leet”, “code”}, will the result be true?

  20. german on 2014-9-24

    Another solution, O(n^3), where n is the length of s. I assume that the set is a HashSet.

    public boolean wordBreak(String s, Set dict) {
        if(s == null || s.length() == 0){
            return true;
        }
        boolean[] arr = new boolean[s.length()];
        if(dict.contains(s.charAt(0)+"")){
            arr[0] = true;
        }
        for(int i = 1;i<arr.length;i++){
            if (dict.contains(s.substring(0, i + 1))) {
                arr[i] = true;
            }
            for (int j = 0; j < i; j++) {
                int a = i - j + 1;
                String subWord = s.substring(j + 1, j + a);
                if (dict.contains(subWord) && arr[j]) {
                    arr[i] = true;
                    j = i;
                    break;
                }
            }
        }
        return arr[arr.length-1];
    }

  21. Guest on 2014-11-1

    {
        if (s == null || s.length() == 0 || dict.isEmpty()) {
            return false;
        }
        for (String w : dict) {
            if (s.equals(w)) {
                return true;
            } else if (s.startsWith(w)) {
                String newS = new String(s);
                do {
                    newS = newS.replace(w, "");
                } while (newS.startsWith(w));
                if (newS.equals("")) {
                    return true;
                }
                boolean result = this.wordBreak(newS, dict);
                if (result) {
                    return true;
                }
            }
        }
        return false;
    }

  22. nikhil rao on 2014-11-10

    /* package whatever; // don't place package name! */

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class Wordbreak
    {
        public static boolean wordBreak(String s, String[] dict){
            int counter = 0;
            System.out.println("Given string to beak :"+ s);
            for(String ds : dict){
                //System.out.println("dict string :"+ ds);
                if(s.contains(ds)){
                    counter = counter +1;
                    //System.out.println("Index of :"+ ds +" " + s.indexOf(ds));
                    int strtindex = s.indexOf(ds);
                    int len = ds.length();
                    String sb = s.substring( strtindex, strtindex+len);
                    //System.out.println("Am der "+ sb);
                    //System.out.println("dict string :"+ ds);
                }
            }
            if (counter>0){
                return true;
            }
            return false;
        }

        public static void main (String[] args) throws java.lang.Exception
        {
            String s = "programcreek";
            String[] dict = new String[]{"programcree","program","creek"};
            boolean b = wordBreak(s, dict);
            if(b){
                System.out.println("Can be done");
            }
            else{
                System.out.println("Not possible!!");
            }
        }
    }

  23. apricot on 2014-12-8

    Because you skipped the last IF condition if t[end] is already true, now all possibilities are not given.

    For INPUT: “leetcode”, [“leetcode”,”leet”,”code”].

    matches are : leetcode leet

    code is not given as t[end] is made true by match leetcode.

  24. JY on 2014-12-27

    Can anyone clarify the complexity of naïve?

  25. Aaron Zhang on 2015-1-3

    not true if dict = {“a”, “ab”}

  26. Dhaval Dave on 2015-1-4

    DP and Recursive Solution with working code at http://www.gohired.in/2014/12/word-break-problem.html

  27. hdante on 2015-1-10

    The problem is supposed to be equivalent to matching the regexp (leet|code)*, which means that it can be solved by building a DFA in O(2^m) and executing it in O(n)

  28. Andrey Sh on 2015-1-26

    Very short Python solution, also using trie:

    class WordSplitTreeNode:
        def __init__(self):
            self.children = [None for i in range(ord("z") - ord("a") + 1)]
            self.final = False

    class WordSplitTree:
        def __init__(self):
            self.root = WordSplitTreeNode()

        def addWord(self, word):
            node = self.root
            for c in word:
                i = ord(c) - ord("a")
                if node.children[i] is None:
                    node.children[i] = WordSplitTreeNode()
                node = node.children[i]
            node.final = True

        def iterate(self, word, pos, node):
            if node.final and pos == len(word):
                yield ""
                return
            nextNodes = []
            i = ord(word[pos]) - ord("a")
            if node.final:
                nextNodes.append(self.root.children[i])
            nextNodes.append(node.children[i])
            for nextNode in nextNodes:
                if not nextNode is None:
                    for s in self.iterate(word, pos + 1, nextNode):
                        yield (" " if node.final and nextNode is self.root.children[i] else "") + word[pos] + s

    def checkWordCanBeSplit(word, dictWords):
        tree = WordSplitTree()
        for w in dictWords:
            tree.addWord(w)
        return [splitWord for splitWord in tree.iterate(word, 0, tree.root)]

  29. Truong Khanh Nguyen on 2015-2-24

    Thanks for your nice and complete post. Verifying the validity of a string is easy. It is more complex to split a valid string into words. My discussion and Java program can be found here http://www.capacode.com/?p=335

  30. ryanlr on 2015-2-27

    Your solution does not pass leetcode online judge.

    Input: “goalspecial”, [“go”,”goal”,”goals”,”special”]
    Output: false
    Expected: true

  31. burdz on 2015-3-4

    I think instead of returning wordBreak(s.substring(i), dict) you need to have that in the if statement with dict.contains(sstr). Otherwise your function returns too early in some cases.

    if (dict.contains(sstr) && wordBreak(s.substring(i), dict)) return true;

  32. Pan on 2015-3-7

    Yes, the wordBreak(s.substring(i), dict) should be put in the if condition. However, my solution cannot pass the latched online judge.

  33. Puneet on 2015-3-8

    how about this?

    import java.util.*;

    public class WordBreak {
        static Set dictionary = new HashSet();

        public static void main (String[] args) {
            initializeDictionary();
            System.out.println(wordBreak("leetcodesamsung"));
        }

        private static Boolean wordBreak(String s) {
            Boolean[] memo = new Boolean[s.length()];
            Arrays.fill(memo, Boolean.FALSE);
            int startIndex = 0;
            for (int i = 0; i < s.length(); i++) {
                int endIndex = i+1;
                if (!memo[i] && dictionary.contains(s.substring(startIndex,endIndex)))
                {
                    memo[i] = true;
                    startIndex = endIndex;
                }
            }

            return memo[s.length() - 1];
        }

        private static void initializeDictionary() {
            dictionary.add("le");
            dictionary.add("et");
            dictionary.add("code");
            dictionary.add("samsun");
            dictionary.add("g");
        }
    }

  34. Abhay on 2015-3-11

    The dynamic solution fails for the case
    s = “aaaab”
    dict = [“a”, “aa”, “ac”]

    It returns true but should return false.

  35. Stephen Boesch on 2015-4-11

    A Trie is a better solution than DP for this problem.

  36. Stephen Boesch on 2015-4-11

    Yes, I had commented that a Trie was a better solution before seeing that you had already posted it.

  37. Rahul Shukla on 2015-12-14

    http://www.ideserve.co.in/learn/word-break-problem
    Here is a detailed explanation of the algorithm.

  38. Jangku on 2016-6-14

    This implementation looks neat. But the complexity is exponential so I would choose polynomial implementation for my case.

  39. KARTHIKEYAN DEVENDRAN on 2016-7-7


    public boolean wordBreak(String s, Set<String> dict){
        HashMap<String, Boolean> map = new HashMap<String, Boolean>();
        return wordBreak(s, dict, map);
    }

    public boolean wordBreak(String s, Set<String> dict, HashMap<String, Boolean> map){
        if(map.containsKey(s)){
            return map.get(s);
        }

        for (int i = 0; i <= s.length(); i++) {
            String prefix = s.substring(0,i);
            if(dict.contains(prefix)){
                if(i == s.length()){
                    return true;
                }
                if(wordBreak(s.substring(i),dict,map)){
                    map.put(s.substring(i), true);
                    return true;
                }
            }
        }
        map.put(s, false);
        return false;
    }

  40. Milan on 2016-9-4

    A slight improvement on Solution 3: use boolean instead of int to avoid confusion.

    public boolean wordBreak(String s, Set wordDict) {
        boolean[] pos = new boolean[s.length()+1];
        pos[0] = true;
        for(int i = 0; i < s.length(); i++){
            for(int j = i+1; pos[i] && j <= s.length(); j++){
                if (wordDict.contains(s.substring(i, j)))
                    pos[j] = true;
            }
        }
        return pos[s.length()];
    }

  41. Nicole on 2016-10-27

    if Approach 1 and Approach 3 both are O(N^2), why is 3 so much better than one?

  42. Rana on 2017-8-10

    The brute force solution seems to be wrong?

    Input: s = soybean ; dict = {“so”, “y”, “bean”}
    Output: NO

    But output should be YES

  43. Newbie on 2017-8-25

    Why do I think that the first solution is the most efficient? The remaining two solutions loop through each char in string s, while the first one did not. When you call dict.contains() in solution 3, I think below the surface the dictionary is looped through too.
    The naive approach is actually the best, isn’t it? This approach does not loop string s from 0 to s.length-1
