172. Design Backup System for a File Storage Service

Asked in

Design Backup System for a File Storage Service
You are asked to design a backup system for a file storage service.

The system stores files, supports reading and writing files, and can create backups of the current file system state.

A backup can be one of three types: FULL, DIFFERENTIAL, or LOG.

A FULL backup stores the complete current state of all existing files. A DIFFERENTIAL backup stores only the final file changes made since the latest full backup. A LOG backup stores the ordered write operations performed after the latest backup pointer.

The system must support creating backups using caller-provided backup ids, reading the exact backup data stored for a file, and restoring backups.

There is no delete file operation, and no backup data format contains delete records. The problem guarantees that restoreBackup is called only when a successful restore does not require deleting any file from the current file system.

Methods

BackupSystem()
  • Creates a new empty backup system with no files and no backups.
void writeFile(String fileName, String content)
  • Creates or updates fileName with content.
  • If the file already exists, its old content is replaced.
  • This operation may later be stored inside a LOG backup.
String readFile(String fileName)
  • Returns the current content of the file.
  • Returns an empty string if the file does not exist.
  • This method does not change the file system and is never stored inside a LOG backup.
boolean createBackup(int backupId, String backupType)
  • backupId is provided by the caller and may be given in any order.
  • backupType must be FULL, DIFFERENTIAL, or LOG.
  • Returns true if the backup is created successfully.
  • Returns false if backupId already exists or backupType is invalid.
  • A DIFFERENTIAL or LOG backup can be created only after at least one successful FULL backup exists.
  • If this method returns false, no state changes.
  • Backup data is fixed at creation time and never changes later.
  • After a successful creation, the latest backup pointer becomes the newly created backup.
  • The latest backup is based on operation order, not the largest backupId.
  • The latest full backup is updated only when a new FULL backup is successfully created.

FULL Backup Creation Rules

A FULL backup stores all files that exist at the time of backup creation. It can be created even when the file system is empty.

DIFFERENTIAL Backup Creation Rules

A DIFFERENTIAL backup compares the current file system state with the latest successful FULL backup. It stores UPSERT,fileName,content only for files whose final current content is different from the latest full backup, including files created after that full backup.

If a file was written multiple times, only its final content at the time of differential backup creation matters. If the final content is the same as in the latest full backup, nothing is stored for that file.

Its base full backup is fixed at creation time. Later full backups do not change the base of an existing differential backup. To restore it, restore its base full backup first, then apply the differential changes.

LOG Backup Creation Rules

A LOG backup stores all ordered write operations performed after the latest backup pointer. Its previous backup dependency is fixed as the latest backup pointer at creation time.

To restore it, restore its previous backup first, then replay the stored write operations in order. If the previous backup is another LOG or DIFFERENTIAL backup, restore dependencies recursively.
String readBackupData(int backupId, String backupType, String fileName)
  • Returns the exact stored backup data for fileName if the given backup directly stores data for that file.
  • Returns an empty string if backupId does not exist.
  • Returns an empty string if backupType is invalid or does not match the actual type of the backup.
  • Returns an empty string if the backup is not active on the given file.
  • A backup is active on a file only if that backup directly stores data for that file.
  • This method does not change the current file system.
  • A backup that cannot be restored can still be read using this method.

Backup Data Format

readBackupData must return data using the exact formats below. No extra spaces should be added before or after separators. If content is empty, the returned data still contains the trailing comma after fileName.

Example:
FILE,a.txt,

FULL Backup Data Format

If a FULL backup stores a file, return:

FILE,fileName,content

Example:
FILE,a.txt,A1

DIFFERENTIAL Backup Data Format

If a DIFFERENTIAL backup stores a created or updated file, return:

UPSERT,fileName,content

Example:
UPSERT,a.txt,A2

The differential backup stores only final content differences from its base full backup.

LOG Backup Data Format

Each write operation stored in a LOG backup is represented as:

WRITE,fileName,content

If the same file has multiple write operations in the same log backup, return all of them in original operation order separated by |.

Example:
WRITE,a.txt,A2|WRITE,a.txt,A3
boolean restoreBackup(int backupId, String backupType)
  • Returns true if restore is successful.
  • Returns false if backupId does not exist.
  • Returns false if backupType is invalid or does not match the actual backup type.
  • Returns false if the requested backup was created before the latest successfully created FULL backup.
  • A backup can be restored only if it is the latest successful FULL backup or was created after it.
  • Older backups may still be read using readBackupData, but they cannot be restored.
  • After a successful restore, the current file system must exactly match the state represented by that backup.
  • Restore may update file contents and recreate missing files, but test cases never require deleting files.
  • After a successful restore, the latest backup pointer becomes the restored backup.
  • Calling restore does not change the latest full backup.
  • The restore operation itself is not stored as a future LOG write operation.
  • After a successful restore, the pending write-operation log is cleared.
  • If restore returns false, no state changes.

Constraints

  • 1 ≤ fileName.length() ≤ 100
  • 0 ≤ content.length() ≤ 1000
  • 1 ≤ number of method calls ≤ 100,000
  • 1 ≤ backupId ≤ 100,000
  • backupId may be provided in any order.
  • backupType is a non-empty string and may be valid or invalid.
  • File names contain lowercase English letters, digits, dots, and underscores.
  • File names do not contain comma , or pipe | characters.
  • Content may contain lowercase English letters, uppercase English letters, digits, spaces, dots, and underscores.
  • Content does not contain comma , or pipe | characters.

Examples

Example 1: Full Backup Data

BackupSystem system = new BackupSystem()
system.writeFile(fileName = "a.txt", content = "A1")
system.writeFile(fileName = "b.txt", content = "B1")
system.createBackup(backupId = 1, backupType = "FULL")
Output: true
system.readBackupData(backupId = 1, backupType = "FULL", fileName = "a.txt")
Output: "FILE,a.txt,A1"
system.readBackupData(backupId = 1, backupType = "FULL", fileName = "c.txt")
Output: ""
system.writeFile(fileName = "a.txt", content = "A2")
system.restoreBackup(backupId = 1, backupType = "FULL")
Output: true
system.readFile(fileName = "a.txt")
Output: "A1"

Explanation

Backup 1 is a FULL backup. It stores a.txt as FILE,a.txt,A1. It does not store c.txt, so reading backup data for c.txt returns an empty string. After restoring backup 1, a.txt goes back from A2 to A1.

Example 2: Differential Backup Data

BackupSystem system = new BackupSystem()
system.writeFile(fileName = "a.txt", content = "A1")
system.writeFile(fileName = "b.txt", content = "B1")
system.createBackup(backupId = 10, backupType = "FULL")
Output: true
system.writeFile(fileName = "a.txt", content = "A2")
system.writeFile(fileName = "c.txt", content = "C1")
system.createBackup(backupId = 3, backupType = "DIFFERENTIAL")
Output: true
system.readBackupData(backupId = 3, backupType = "DIFFERENTIAL", fileName = "a.txt")
Output: "UPSERT,a.txt,A2"
system.readBackupData(backupId = 3, backupType = "DIFFERENTIAL", fileName = "b.txt")
Output: ""
system.readBackupData(backupId = 3, backupType = "DIFFERENTIAL", fileName = "c.txt")
Output: "UPSERT,c.txt,C1"
system.restoreBackup(backupId = 3, backupType = "DIFFERENTIAL")
Output: true
system.readFile(fileName = "a.txt")
Output: "A2"
system.readFile(fileName = "b.txt")
Output: "B1"
system.readFile(fileName = "c.txt")
Output: "C1"

Explanation

Backup 3 is based on full backup 10. File a.txt changed from A1 to A2, and c.txt was created after the full backup. File b.txt did not change, so the differential backup is not active on it. Backup ids can be provided in any order.

Example 3: Log Backup Data

BackupSystem system = new BackupSystem()
system.writeFile(fileName = "a.txt", content = "A1")
system.createBackup(backupId = 5, backupType = "FULL")
Output: true
system.writeFile(fileName = "a.txt", content = "A2")
system.writeFile(fileName = "a.txt", content = "A3")
system.writeFile(fileName = "b.txt", content = "B1")
system.createBackup(backupId = 2, backupType = "LOG")
Output: true
system.readBackupData(backupId = 2, backupType = "LOG", fileName = "a.txt")
Output: "WRITE,a.txt,A2|WRITE,a.txt,A3"
system.readBackupData(backupId = 2, backupType = "LOG", fileName = "b.txt")
Output: "WRITE,b.txt,B1"
system.readBackupData(backupId = 2, backupType = "LOG", fileName = "c.txt")
Output: ""
system.restoreBackup(backupId = 2, backupType = "LOG")
Output: true
system.readFile(fileName = "a.txt")
Output: "A3"
system.readFile(fileName = "b.txt")
Output: "B1"

Explanation

Backup 2 stores write operations performed after backup 5. File a.txt has two stored operations, so both are returned in order separated by |. File c.txt has no stored operation in backup 2. Backup id 2 is the latest backup after it is created because latest means operation order, not largest id.

Example 4: Older Backup Cannot Be Restored

BackupSystem system = new BackupSystem()
system.writeFile(fileName = "a.txt", content = "A1")
system.createBackup(backupId = 1, backupType = "FULL")
Output: true
system.writeFile(fileName = "b.txt", content = "B1")
system.createBackup(backupId = 2, backupType = "FULL")
Output: true
system.restoreBackup(backupId = 1, backupType = "FULL")
Output: false
system.readBackupData(backupId = 1, backupType = "FULL", fileName = "a.txt")
Output: "FILE,a.txt,A1"
system.restoreBackup(backupId = 2, backupType = "FULL")
Output: true

Explanation

Backup 1 was created before the latest successful full backup, backup 2, so it cannot be restored. It can still be read using readBackupData. Backup 2 can be restored because it is the latest successful full backup.




Please use Laptop/Desktop or any other large screen to add/edit code.