Leetcode – Word Break (Java)

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

For example, given
s = “leetcode”,
dict = [“leet”, “code”].

Return true because “leetcode” can be segmented as “leet code”.

1. Naive Approach

This problem can be solved with a naive recursive approach. It is trivial, but a discussion can always start from it.

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
             return wordBreakHelper(s, dict, 0);
    }
 
    public boolean wordBreakHelper(String s, Set<String> dict, int start){
        if(start == s.length()) 
            return true;
 
        for(String a: dict){
            int len = a.length();
            int end = start+len;
 
            //end index should be <= string length
            if(end > s.length()) 
                continue;
 
            if(s.substring(start, start+len).equals(a))
                if(wordBreakHelper(s, dict, start+len))
                    return true;
        }
 
        return false;
    }
}

Time is exponential in the worst case, because the same start positions are explored again and again, so it exceeds the time limit.
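One way to keep the recursive shape while avoiding repeated work is to remember which start positions have already been shown to fail. The following is only a minimal sketch under that assumption, not the article's own code; the class name WordBreakMemo and the failed array are names introduced here for illustration.

```java
import java.util.Set;

class WordBreakMemo {
    static boolean wordBreak(String s, Set<String> dict) {
        // failed[i] == true means the suffix starting at i is known to be unsegmentable
        return helper(s, dict, 0, new boolean[s.length() + 1]);
    }

    private static boolean helper(String s, Set<String> dict, int start, boolean[] failed) {
        if (start == s.length()) {
            return true;
        }
        if (failed[start]) {
            return false; // already explored, no need to recurse again
        }
        for (String word : dict) {
            // skip empty words, which would recurse on the same position forever
            if (!word.isEmpty()
                    && s.startsWith(word, start)
                    && helper(s, dict, start + word.length(), failed)) {
                return true;
            }
        }
        failed[start] = true;
        return false;
    }
}
```

Each start position is fully explored at most once, so the number of recursive expansions is bounded by the string length times the dictionary size.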

2. Dynamic Programming

The key to solving this problem with dynamic programming:

  • Define an array t[] such that t[i] == true means the prefix s[0..i-1] can be segmented using the dictionary
  • Initial state: t[0] == true (the empty prefix is trivially segmentable)

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length()+1];
        t[0] = true; // initial state: the empty prefix is trivially segmentable
 
        for(int i=0; i<s.length(); i++){
            //should continue from match position
            if(!t[i]) 
                continue;
 
            for(String a: dict){
                int len = a.length();
                int end = i + len;
                if(end > s.length())
                    continue;
 
                if(t[end]) continue;
 
                if(s.substring(i, end).equals(a)){
                    t[end] = true;
                }
            }
        }
 
        return t[s.length()];
    }
}

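Time: O(string length * dict size), not counting the cost of each substring comparison, which adds a factor of up to the length of the longest dictionary word.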
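To make the meaning of t[] concrete, the sketch below returns the whole table rather than only its last entry. It follows the same loop as the solution above; the class name WordBreakTable and the method name table are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class WordBreakTable {
    // Returns t, where t[i] is true when s.substring(0, i) can be segmented.
    static boolean[] table(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length() + 1];
        t[0] = true; // empty prefix
        for (int i = 0; i < s.length(); i++) {
            if (!t[i]) {
                continue; // only extend from positions that are reachable
            }
            for (String word : dict) {
                int end = i + word.length();
                if (end <= s.length() && s.startsWith(word, i)) {
                    t[end] = true;
                }
            }
        }
        return t;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("lee", "leet", "code"));
        System.out.println(Arrays.toString(table("leetcode", dict)));
    }
}
```

For s = "leetcode" and dict = ["lee", "leet", "code"] this prints [true, false, false, true, true, false, false, false, true]. Both "lee" (index 3) and "leet" (index 4) are recorded, so a shorter overlapping word does not prevent "leet code" from being found.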

3. Java Solution 3 – Simple and Efficient

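In Solution 2, the running time grows with the size of the dictionary, which is a problem when the dictionary is very large. Instead, we can solve the problem with O(n^2) dictionary lookups (n is the length of the string), regardless of the dictionary size if the set is hash-based.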

public boolean wordBreak(String s, Set<String> wordDict) {
    int[] pos = new int[s.length()+1];
 
    Arrays.fill(pos, -1);
 
    pos[0]=0;
 
    for(int i=0; i<s.length(); i++){
        if(pos[i]!=-1){
            for(int j=i+1; j<=s.length(); j++){
                String sub = s.substring(i, j);
                if(wordDict.contains(sub)){
                    pos[j]=i;
                }
            } 
        }
    }
 
    return pos[s.length()]!=-1;
}
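A possible refinement, assuming the dictionary is a hash-based set: no substring longer than the longest dictionary word can ever match, so the inner loop can stop early. This is a sketch of that idea rather than part of the original solution; WordBreakBounded and maxLen are names introduced here.

```java
import java.util.Set;

class WordBreakBounded {
    static boolean wordBreak(String s, Set<String> wordDict) {
        // length of the longest dictionary word (0 for an empty dictionary)
        int maxLen = 0;
        for (String w : wordDict) {
            maxLen = Math.max(maxLen, w.length());
        }

        boolean[] ok = new boolean[s.length() + 1];
        ok[0] = true;
        for (int i = 0; i < s.length(); i++) {
            if (!ok[i]) {
                continue;
            }
            // candidates longer than maxLen cannot be in the dictionary
            int limit = Math.min(s.length(), i + maxLen);
            for (int j = i + 1; j <= limit; j++) {
                if (!ok[j] && wordDict.contains(s.substring(i, j))) {
                    ok[j] = true;
                }
            }
        }
        return ok[s.length()];
    }
}
```

With this bound the number of lookups is at most n times the longest word length, which is usually far smaller than n^2 for long inputs.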

4. The More Interesting Problem

The dynamic programming solution tells us whether the string can be broken into words, but not which words it is broken into. So how do we get those words?

Check out Word Break II.
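For a single segmentation (not all of them, which is what Word Break II asks for), the pos array from Solution 3 already carries enough information: pos[j] is the start index of a dictionary word that ends at j. The sketch below walks those links backwards from the end of the string. The names WordBreakPath and oneSegmentation are hypothetical, and keeping the first start index found for each position is an arbitrary choice.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

class WordBreakPath {
    // Returns one segmentation of s, or an empty list if none exists
    // (an empty s also yields an empty list).
    static List<String> oneSegmentation(String s, Set<String> wordDict) {
        int[] pos = new int[s.length() + 1];
        Arrays.fill(pos, -1);
        pos[0] = 0;

        for (int i = 0; i < s.length(); i++) {
            if (pos[i] == -1) {
                continue;
            }
            for (int j = i + 1; j <= s.length(); j++) {
                if (pos[j] == -1 && wordDict.contains(s.substring(i, j))) {
                    pos[j] = i; // a word s[i..j-1] ends at j
                }
            }
        }

        LinkedList<String> words = new LinkedList<>();
        if (pos[s.length()] == -1) {
            return words;
        }
        // follow the links from the end of the string back to index 0
        for (int end = s.length(); end > 0; end = pos[end]) {
            words.addFirst(s.substring(pos[end], end));
        }
        return words;
    }
}
```

For s = "leetcode" and the dictionary ["lee", "leet", "code"], the links are pos[8] = 4 and pos[4] = 0, so the result is [leet, code].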

52 thoughts on “Leetcode – Word Break (Java)”

  1. It works OK for example given in the article but can be easily broken:
    s = “leetcode”,
    dict = [“lee”,”leet”, “code”].

  2. I like this problem… it is so simple, but a nice exercise.
    Here is a version using a stack instead of recursion (just for fun), however the complexity is O(n^2)… not acceptable.


    // Brute-force:
    // T(n) = O(n^2)
    private boolean wordBreak_(String s, Set<String> dict) {
    if (s == null || s.length() == 0)
    return false;

    Stack<Integer> stack = new Stack<Integer>();
    int n = s.length();

    stack.push(0);
    while (!stack.empty()) {
    int start = stack.pop();
    if (start == n)
    return true;

    for (String a:dict) {
    int len = a.length();
    if (start + len > n)
    continue;

    if (s.substring(start, start+len).equals(a))
    stack.push(start + len);
    }
    }

    return false;
    }

  3. I like this problem… it is so simple, but a nice exercise.
    Here is a version using a stack instead of recursion (just for fun), however the complexity is O(n^2)… not acceptable.

    public boolean wordBreak_(String s, Set<String> dict) {
    if (s == null || s.length() == 0)
    return false;

    Stack<Integer> stack = new Stack<Integer>();
    int n = s.length();

    stack.push(0);
    while (!stack.empty()) {
    int start = stack.pop();
    for (String a:dict) {
    int len = a.length();
    if (start + len > n)
    continue;

    boolean isSame = s.substring(start, start+len).equals(a);
    if (isSame && start + len == n)
    return true;

    if (isSame)
    stack.push(start + len);
    }
    }
    return false;
    }

  4. The description is not complete:
    1) not all of the words in the dictionary have to be used
    2) you can use a word from the dictionary multiple times
    3) words in the dictionary can be substrings of other words in the dictionary


  5. Java one loop solution. This is the shortest I have seen here and probably the most efficient. See my code below :)

    public static boolean WordBreak(String str, List words){
    StringBuilder sbOne = new StringBuilder(str);
    int start=0, end=str.length(), counter=1;

    while(counter <= end){
    if(words.contains(sbOne.substring(start,counter))){
    sbOne.delete(start,counter);
    counter=1;
    end=sbOne.length();
    }
    counter++;
    }
    return (sbOne.length() == 0) ? true : false;
    }

  6. Solution 2 is exhaustively iterating through the word dictionary, which is problematic (as is use of a HashSet). Using a SortedSet, you can constrain the words you compare against via set.subSet over [minimal substring of input string, max substring of input string].

  7. Why do I think that the first solution is the most efficient? The remaining two solutions loop through each char in string s, while the first one did not. When you call dict.contains() in solution 3, I think below the surface the dictionary is looped through too.
    The naive approach is actually the best, isn’t it? This approach does not loop string s from 0 to s.length-1

  8. The brute force solution seems to be wrong?

    Input: s = soybean ; dict = {“so”, “y”, “bean”}
    Output: NO

    But output should be YES

  9. Slight improvement on solution 3: use boolean instead of int to avoid confusion.

    public boolean wordBreak(String s, Set wordDict) {
    boolean[] pos = new boolean[s.length()+1];
    pos[0] = true;
    for(int i = 0; i < s.length(); i++){
    for(int j = i+1; pos[i] && j <= s.length(); j++){
    if (wordDict.contains(s.substring(i, j)))
    pos[j] = true;
    }
    }
    return pos[s.length()];
    }


  10. public boolean wordBreak(String s, Set<String> dict){
    HashMap<String, Boolean> map = new HashMap<String, Boolean>();
    return wordBreak(s, dict, map);
    }

    public boolean wordBreak(String s, Set<String> dict, HashMap<String, Boolean> map){

    if(map.containsKey(s)){
    return map.get(s);
    }

    for (int i = 0; i <= s.length(); i++) {
    String prefix = s.substring(0,i);
    if(dict.contains(prefix)){
    if(i == s.length()){
    return true;
    }
    if(wordBreak(s.substring(i),dict,map)){
    map.put(s.substring(i), true);
    return true;
    }
    }
    }
    map.put(s, false);
    return false;
    }

  11. This implementation looks neat. But the complexity is exponential so I would choose polynomial implementation for my case.

  12. Yes, I had commented that a trie would be a better solution, before seeing that you had already posted it.

  13. The dynamic solution fails for the case
    s = “aaaab”
    dict = [“a”, “aa”, “ac”]

    It returns true but should return false.

  14. how about this?

    import java.util.*;

    public class WordBreak {
    static Set dictionary = new HashSet();
    public static void main (String[] args) {
    initializeDictionary();
    System.out.println(wordBreak("leetcodesamsung"));
    }

    private static Boolean wordBreak(String s) {
    Boolean[] memo = new Boolean[s.length()];
    Arrays.fill(memo, Boolean.FALSE);
    int startIndex = 0;
    for (int i = 0; i < s.length(); i++) {
    int endIndex = i+1;
    if (!memo[i] && dictionary.contains(s.substring(startIndex,endIndex)))
    {
    memo[i] = true;
    startIndex = endIndex;
    }
    }

    return memo[s.length() - 1];
    }

    private static void initializeDictionary() {
    dictionary.add("le");
    dictionary.add("et");
    dictionary.add("code");
    dictionary.add("samsun");
    dictionary.add("g");
    }
    }

  15. Yes, the wordBreak(s.substring(i), dict) should be put in the if condition. However, my solution cannot pass the latest online judge.

  16. I think instead of returning wordBreak(s.substring(i), dict) you need to have that in the if statement with dict.contains(sstr). Otherwise your function returns too early in some cases

    if (dict.contains(sstr) && wordBreak(s.substring(i), dict)) return true;

  17. Your solution does not pass leetcode online judge.

    Input: “goalspecial”, [“go”,”goal”,”goals”,”special”]
    Output: false
    Expected: true

  18. Very short Python solution, also using trie:

    class WordSplitTreeNode:
        def __init__(self):
            self.children = [None for i in range(ord("z") - ord("a") + 1)]
            self.final = False

    class WordSplitTree:
        def __init__(self):
            self.root = WordSplitTreeNode()

        def addWord(self, word):
            node = self.root
            for c in word:
                i = ord(c) - ord("a")
                if node.children[i] is None:
                    node.children[i] = WordSplitTreeNode()
                node = node.children[i]
            node.final = True

        def iterate(self, word, pos, node):
            if pos == len(word):
                # end of input: only a complete word counts as a match
                if node.final:
                    yield ""
                return
            nextNodes = []
            i = ord(word[pos]) - ord("a")
            if node.final:
                nextNodes.append(self.root.children[i])
            nextNodes.append(node.children[i])
            for nextNode in nextNodes:
                if not nextNode is None:
                    for s in self.iterate(word, pos + 1, nextNode):
                        yield (" " if node.final and nextNode is self.root.children[i] else "") + word[pos] + s

    def checkWordCanBeSplit(word, dictWords):
        tree = WordSplitTree()
        for w in dictWords:
            tree.addWord(w)
        return [splitWord for splitWord in tree.iterate(word, 0, tree.root)]

  19. The problem is supposed to be equivalent to matching the regexp (leet|code)*, which means that it can be solved by building a DFA in O(2^m) and executing it in O(n)

  20. Because you skipped the last IF condition if t[end] is already true, now all possibilities are not given.

    For INPUT: “leetcode”, [“leetcode”,”leet”,”code”].

    matches are : leetcode leet

    code is not given as t[end] is made true by match leetcode.

  21. /* package whatever; // don't place package name! */

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class Wordbreak
    {
        public static boolean wordBreak(String s, String[] dict){
            int counter = 0;
            System.out.println("Given string to break :"+ s);
            for(String ds : dict){
                //System.out.println("dict string :"+ ds);
                if(s.contains(ds)){
                    counter = counter +1;
                    //System.out.println("Index of :"+ ds +" " + s.indexOf(ds));
                    int strtindex = s.indexOf(ds);
                    int len = ds.length();
                    String sb = s.substring( strtindex, strtindex+len);
                    //System.out.println("Am der "+ sb);
                    //System.out.println("dict string :"+ ds);
                }
            }
            if (counter>0){
                return true;
            }
            return false;
        }

        public static void main (String[] args) throws java.lang.Exception
        {
            String s = "programcreek";
            String[] dict = new String[]{"programcree","program","creek"};
            boolean b = wordBreak(s, dict);
            if(b){
                System.out.println("Can be done");
            }
            else{
                System.out.println("Not possible!!");
            }
        }
    }

  22. {
        if (s == null || s.length() == 0 || dict.isEmpty()) {
            return false;
        }
        for (String w : dict) {
            if (s.equals(w)) {
                return true;
            } else if (s.startsWith(w)) {
                String newS = new String(s);
                do {
                    newS = newS.replace(w, "");
                } while (newS.startsWith(w));
                if (newS.equals("")) {
                    return true;
                }
                boolean result = this.wordBreak(newS, dict);
                if (result) {
                    return true;
                }
            }
        }
        return false;
    }

  23. Another solution, O(n^3), where n is the length of s. I assume that the set is a HashSet.

    public boolean wordBreak(String s, Set dict) {
        if(s == null || s.length() == 0){
            return true;
        }
        boolean[] arr = new boolean[s.length()];
        if(dict.contains(s.charAt(0)+"")){
            arr[0] = true;
        }
        for(int i = 1;i<arr.length;i++){
            if (dict.contains(s.substring(0, i + 1))) {
                arr[i] = true;
            }
            for (int j = 0; j < i; j++) {
                int a = i - j + 1;
                String subWord = s.substring(j + 1, j + a);
                if (dict.contains(subWord) && arr[j]) {
                    arr[i] = true;
                    j = i;
                    break;
                }
            }
        }
        return arr[arr.length-1];
    }

  24. One of the questions will be: can we use the same dictionary word more than once? For example, if we have the word “leetcodeleetcode” and the dictionary has the words {“leet”, “code”}, will the result be true?

  25. Apparently, this algorithm is not correct.

    If you use this dictionary

    dict.add("leet");
    dict.add("code");
    dict.add("lee");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");

    , then it will not recognize “leetcode”.

  26. I guess this can be solved by using Tries also.

    Time Complexity : O(n) + O(m)
    Space Complexity : O(m)

    Let me know if the following code will work for all cases.

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class TrieNode {
    char val;
    boolean isRoot;
    boolean isLeaf;
    TrieNode children[]; // There can be atmost 26 children (english alphabets)
    TrieNode() {
    val = '^';
    isRoot = true;
    children = new TrieNode[26];
    initializeChildren();
    }
    TrieNode(char val) {
    this.val = val;
    isRoot = false;
    isLeaf = false;
    children = new TrieNode[26];
    initializeChildren();
    }

    void initializeChildren() {
    for(int i=0;i<children.length;i++) {
    children[i] = null;
    }
    }
    }

    class Trie {
    TrieNode root;
    Trie() {
    root = new TrieNode();
    }

    void insert(String input) {
    int start = 'A';
    TrieNode current = root;

    for(int i=0;i<input.length();i++) {
    int val = Character.toUpperCase(input.charAt(i));
    int index = val - start;

    if(current.children[index] == null)
    current.children[index] = new TrieNode(input.charAt(i));

    current = current.children[index];
    }
    current.isLeaf = true;
    }
    }

    class Example {

    boolean wordBreak(String s, Set<String> dict) {
    // First Construct Trie from the dictionary

    Trie a = new Trie();
    for(String i : dict) {
    a.insert(i);
    }

    TrieNode current = a.root;
    int start = 'A';

    boolean result = false;

    for(int i=0; i<s.length();i++) {
    int val = Character.toUpperCase(s.charAt(i));
    int index = val - start;

    if(current.children[index] == null) {
    // Word is not in the dictionary
    current = a.root;
    result = false;
    break;
    }
    current = current.children[index];
    if(current.isLeaf == true) {
    // Start from the beginning for the next character
    current = a.root;
    result = true;
    }
    }

    return result;

    }

    public static void main(String[] args) {
    Example temp = new Example() ;
    Set<String> dict = new HashSet<String>();
    dict.add("leet");
    dict.add("code");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");
    System.out.println("Wordbreak (programcreek) = " + temp.wordBreak("programcreek", dict));
    System.out.println("Wordbreak (leetcode) = " + temp.wordBreak("leetcode", dict));
    }

    }

  27. import java.util.*;
    import java.lang.*;
    import java.io.*;

    class TrieNode {
    char val;
    boolean isRoot;
    boolean isLeaf;
    TrieNode children[]; // There can be atmost 26 children (english alphabets)
    TrieNode() {
    val = '^';
    isRoot = true;
    children = new TrieNode[26];
    initializeChildren();
    }
    TrieNode(char val) {
    this.val = val;
    isRoot = false;
    isLeaf = false;
    children = new TrieNode[26];
    initializeChildren();
    }

    void initializeChildren() {
    for(int i=0;i<children.length;i++) {
    children[i] = null;
    }
    }
    }

    class Trie {
    TrieNode root;
    Trie() {
    root = new TrieNode();
    }

    void insert(String input) {
    int start = 'A';
    TrieNode current = root;

    for(int i=0;i<input.length();i++) {
    int val = Character.toUpperCase(input.charAt(i));
    int index = val - start;

    if(current.children[index] == null)
    current.children[index] = new TrieNode(input.charAt(i));

    current = current.children[index];
    }
    current.isLeaf = true;
    }
    }

    class Example {

    boolean wordBreak(String s, Set<String> dict) {
    // First Construct Trie from the dictionary

    Trie a = new Trie();
    for(String i : dict) {
    a.insert(i);
    }

    TrieNode current = a.root;
    int start = 'A';

    boolean result = false;

    for(int i=0; i<s.length();i++) {
    int val = Character.toUpperCase(s.charAt(i));
    int index = val - start;

    if(current.children[index] == null) {
    // Word is not in the dictionary
    current = a.root;
    result = false;
    break;
    }
    current = current.children[index];
    if(current.isLeaf == true) {
    // Start from the beginning for the next character
    current = a.root;
    result = true;
    }
    }

    return result;

    }

    public static void main(String[] args) {
    Example temp = new Example() ;
    Set dict = new HashSet();
    dict.add("leet");
    dict.add("code");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");
    System.out.println("Wordbreak (programcreek) = " + temp.wordBreak("programcreek", dict));
    System.out.println("Wordbreak (leetcode) = " + temp.wordBreak("leetcode", dict));
    }

    }

  28. I can think of trie based solution:

    Time Complexity = O(n) + O(m)

    Space Complexity = O(m)

    Let me know if this will work:

    import java.util.*;
    import java.lang.*;
    import java.io.*;

    class TrieNode {
    char val;
    boolean isRoot;
    boolean isLeaf;
    TrieNode children[]; // There can be atmost 26 children (english alphabets)
    TrieNode() {
    val = '^';
    isRoot = true;
    children = new TrieNode[26];
    initializeChildren();
    }

    TrieNode(char val) {
    this.val = val;
    isRoot = false;
    isLeaf = false;
    children = new TrieNode[26];
    initializeChildren();
    }
    void initializeChildren() {
    for(int i=0;i<children.length;i++) {
    children[i] = null;
    }
    }
    }

    class Trie {
    TrieNode root;
    Trie() {
    root = new TrieNode();
    }

    void insert(String input) {
    int start = 'A';
    TrieNode current = root;

    for(int i=0;i<input.length();i++) {
    int val = Character.toUpperCase(input.charAt(i));
    int index = val - start;

    if(current.children[index] == null)
    current.children[index] = new TrieNode(input.charAt(i));
    current = current.children[index];
    }
    current.isLeaf = true;
    }

    }

    public class Example {
    boolean wordBreak(String s, Set<String> dict) {

    // First Construct Trie from the dictionary

    Trie a = new Trie();
    for(String i : dict) {
    a.insert(i);

    }

    TrieNode current = a.root;
    int start = 'A';
    boolean result = false;

    for(int i=0; i<s.length();i++) {

    int val = Character.toUpperCase(s.charAt(i));
    int index = val - start;

    if(current.children[index] == null) {
    // Word is not in the dictionary
    current = a.root;
    result = false;
    break;
    }

    current = current.children[index];
    if(current.isLeaf == true) {
    // Start from the beginning for the next character
    current = a.root;
    result = true;
    }
    }
    return result;

    }

    public static void main(String[] args) {
    Example temp = new Example() ;
    Set dict = new HashSet();
    dict.add("leet");
    dict.add("code");
    dict.add("programcree");
    dict.add("program");
    dict.add("creek");

    System.out.println("Wordbreak (programcreek) = " + temp.wordBreak("programcreek", dict));
    System.out.println("Wordbreak (leetcode) = " + temp.wordBreak("leetcode", dict));

    System.out.println("Wordbreak (lesscode) = " + temp.wordBreak("lesscode", dict));

    }

    }

  29. Small correction of complexity in the 2nd case.

    It’s mentioned that “Time: O(string length * dict size)” but you also run equals (and substring is not constant for Java > 1.6) for every word in dictionary so it’s more like O(string length * dict size * length of the longest word in dict).

  30. Recursion with hashMap:

    class Solution:
        def __init__(self):
            self.table = {}

        def wordBreak(self, s, dict):
            if len(s) == 0:
                return True
            if len(s) == 1:
                return s in dict
            if s in self.table:
                return self.table[s]
            isBreakable = False
            for i in range(len(s)):
                word = s[:i+1]
                if word in dict:
                    subFlag = self.wordBreak(s[i+1:], dict)
                    if s[i+1:] not in self.table:
                        self.table[s[i+1:]] = subFlag
                    isBreakable |= subFlag
            return isBreakable

  31. recursive solution

    public static boolean wordBreak(String s, Set dict){
        //input validation
        //Base case
        if(dict.contains(s))
            return true;
        else {
            for(int i = 0; i < s.length(); i++){
                String sstr = s.substring(0, i);
                if(dict.contains(sstr))
                    return wordBreak(s.substring(i), dict);
            }
        }
        return false;
    }

  32. We can do it in O(n), right? Assuming Set dic is actually a HashSet, retrieval from it is always O(1).

    List arr;
    StringBuilder sb = new StringBuilder();
    int i=0;
    int wordIndex=0;
    while(i>s.length){

    if(dic.get(sb.substring(wordIndex,i) != null){
    wordIndex=i+1;
    }else{
    arr.add(sb.substring(wordIndex, i));
    }
    i++;
    }

  33. Another approach
    package test;

    import java.util.HashSet;
    import java.util.Set;

    public class WordBreak2 {

    public static boolean wordBreak(String s,Set dict) {
    if (s.length()==0) {
    return true;
    }
    for(int i=1; i<=s.length(); i++) {
    String firstWord=s.substring(0, i);
    String remaing=s.substring(i);
    if (dict.contains(firstWord) && wordBreak(remaing, dict) ) {
    System.out.print(" ");
    System.out.print(firstWord);

    return true;
    }
    }
    return false;
    }
    public static void main(String[] args) {
    Set dict=new HashSet(5);
    dict.add("program");
    if (wordBreak("pprogram", dict)) {
    System.out.println(" YES");
    } else {
    System.out.println(" NO");
    }

    dict=new HashSet(5);
    dict.add("ab");
    dict.add("abc");
    dict.add("de");
    if (wordBreak("abcde", dict)) {
    System.out.println(" YES");
    } else {
    System.out.println(" NO");
    }
    }

    }

    }

    This is more efficient if dict is big, which is usually the case.

  34. Thanks for these solutions. Just starting to go through the problems but looks like very useful website.

    As for how to get the words that the string breaks up to:
    Change the “t” array to integer instead of boolean.

    Replace setting t[end] to true (i.e. saying you have found a break-up of the 0..end substring) with setting t[end] to i, thus saying you have found a break-up of the 0..end substring and the last word in that break-up is the substring i..end of the main string.

    Then at the end, if the string can be broken up, check t[s.length()]. The last word in the break-up is the substring starting at t[s.length()] and ending at s.length()-1. Repeat this procedure to get the other words.
