136. Compress String using Prefix Pattern
Compress String using Substring Pattern
Given a string s, compress it by choosing one non-empty substring p of s and using the symbol * to represent repeated occurrences of that same substring, so that the final encoded string is as short as possible.
Every * must represent exactly the same fixed substring p.
The chosen substring p must appear earlier in the encoded string as one contiguous raw segment written using only letters. It cannot be introduced or extended using previous * expansions.
Raw letters may appear before, between, or after * symbols.
After replacing every * with p, the decoded string must be exactly equal to s.
Return the compressed string.
If multiple compressed strings have the same minimum length, return the lexicographically smallest one using normal string lexicographic order.
Method Signature
String compressString(String s)
- Returns the minimum-length encoded string for
s.
- If more than one valid encoding has the same minimum length, returns the lexicographically smallest encoding.
Constraints
1 ≤ s.length() ≤ 200
s contains only uppercase English letters.
Examples
Example 1
Input: compressString(s = "ABCABCEABCABCE"), Output: "ABC*E**E"
Explanation: Choose p = "ABC". The first "ABC" is written as raw characters. Then each * represents "ABC", so "ABC*E**E" decodes to "ABCABCEABCABCE". This is lexicographically smaller than "ABCABCE*".
Example 2
Input: compressString(s = "AAAAAA"), Output: "AA**"
Explanation: One valid minimum-length encoding is "AAA*", where * represents "AAA". Another valid minimum-length encoding is "AA**", where each * represents "AA". Both have the same length, and "AA**" is lexicographically smaller.
Example 3
Input: compressString(s = "ABCD"), Output: "ABCD"
Explanation: No valid use of * can make the encoding shorter, so the original string is already the minimum-length encoding.
Example 4
Input: compressString(s = "XABCDABCDY"), Output: "XABCD*Y"
Explanation: Choose p = "ABCD". The substring "ABCD" is not a prefix of s, but it appears earlier as a raw contiguous segment in the encoding, so the later occurrence can be replaced with *.