Survey Management Service LLD

The service is focused on building an in-memory search engine for a blog platform, where content is categorized, documents can be searched, and results can be ordered.

Jan 14, 2025

Question

LLD of survey management service with proper working code in Java (run locally in IntelliJ; a few manual test cases had to be run as well).
Your organization has started a new tech blog with interesting tech stories, and you’re responsible for designing and implementing an in-memory search engine that supports search over the blog content.


Requirements:
It should be possible to create a category in the search engine.
It should be possible to insert and delete documents in a given category.
It should be possible to search through documents for a search pattern in a given category (Normal, Frequency-Based).
It should be possible to order the search results (Created, Updated).

The given problem statement involves creating an in-memory search engine for a blog platform. The design and implementation need to address various aspects of content categorization, document management, search efficiency, and result ordering. Here’s a theoretical breakdown of the problem and its solution:

Problem Analysis

Core Problem:
- Enable a robust in-memory search engine that supports categorization, addition/removal of documents, and efficient search functionality for blog content.
Key Requirements:
- Ability to create and delete categories (logical groups for blog posts).
- Ability to add and remove documents (individual blog posts) within categories.
- Perform search operations within a category using:
  - A Normal Search for pattern matching.
  - A Frequency-Based Search for ranking documents by term frequency.
- Provide the ability to sort search results based on:
  - Document creation date.
  - Last update date.
Constraints:
- In-memory storage: All data resides in memory, so persistence is not a concern for this system.
- Performance is crucial, as search operations must be fast and efficient for better user experience.
Challenges:
- Efficiently storing and organizing documents.
- Handling searches on potentially large sets of documents in real-time.
- Implementing multiple search strategies with flexibility.
- Sorting results dynamically based on user preferences.

Key Design Considerations

Data Structures:
- Map for Categories: Use a Map<String, Category> to store categories. Each category name serves as the key, and the value is a Category object containing its documents.
- List for Documents: Each Category contains a List<Document> to store associated blog posts.
Search Flexibility:
- Strategy Pattern: Implement search strategies (Normal, Frequency-Based) using the Strategy Design Pattern. This makes it easy to add new search algorithms in the future without modifying existing code.
Sorting Results:
- Use Java's Comparator for dynamic sorting of search results based on creation date, update date, or other criteria.
Separation of Concerns:
- Separate the management of categories/documents from the search logic. This ensures modular and maintainable code.
Scalability:
- While the current implementation is in-memory, the design should be extensible to support distributed systems or persistent storage in the future.
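
As a minimal sketch of the Comparator-based sorting described above: the Hit class and the comparator names below are hypothetical stand-ins (the article's own types are Document and SearchResult), reduced to the two timestamps that ordering needs.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a search hit: only the fields needed for ordering.
final class Hit {
    final String title;
    final LocalDateTime createdAt;
    final LocalDateTime updatedAt;

    Hit(String title, LocalDateTime createdAt, LocalDateTime updatedAt) {
        this.title = title;
        this.createdAt = createdAt;
        this.updatedAt = updatedAt;
    }
}

final class HitOrderings {
    private HitOrderings() {}

    // Oldest first by creation time.
    static final Comparator<Hit> BY_CREATED =
            Comparator.comparing((Hit h) -> h.createdAt);

    // Most recently updated first.
    static final Comparator<Hit> BY_UPDATED_DESC =
            Comparator.comparing((Hit h) -> h.updatedAt).reversed();

    // Returns a sorted copy so the caller's list is left untouched.
    static List<Hit> sorted(List<Hit> hits, Comparator<Hit> order) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(order);
        return copy;
    }
}
```

Because each ordering is just a Comparator value, a caller can pick one at request time without the search logic knowing which was chosen.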

Logical Workflow

Category Management:
- A category is a logical grouping of blog posts (documents). It is created and deleted through the SearchEngine class.
Document Management:
- Documents can be added or removed from a category.
- Each document has metadata (e.g., title, content, creation time, update time) used for search and sorting.
Search Functionality:
- When a user initiates a search:
  - Select the appropriate category.
  - Apply the chosen search strategy (Normal or Frequency-Based).
  - Return results based on the matching criteria.
Sorting Results:
- The search results can be dynamically sorted using Comparator. Users can choose the sorting method (e.g., by creation date or update date).
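
The workflow above (pick the category, apply the strategy, sort the hits) can be sketched in one file. MiniSearchEngine, Doc, Strategy and WorkflowDemo are condensed, hypothetical stand-ins used only so the example runs on its own; they are not the article's classes.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Condensed, hypothetical stand-in for the engine.
final class MiniSearchEngine {
    static final class Doc {
        final String title;
        final String content;
        final LocalDateTime createdAt;

        Doc(String title, String content, LocalDateTime createdAt) {
            this.title = title;
            this.content = content;
            this.createdAt = createdAt;
        }
    }

    // A search strategy reduced to a single method so lambdas can implement it.
    interface Strategy {
        List<Doc> search(String pattern, List<Doc> docs);
    }

    private final Map<String, List<Doc>> categories = new HashMap<>();

    void createCategory(String name) {
        categories.putIfAbsent(name, new ArrayList<>());
    }

    void addDocument(String category, Doc doc) {
        List<Doc> docs = categories.get(category);
        if (docs != null) {
            docs.add(doc);
        }
    }

    // Step 1: pick the category. Step 2: apply the strategy. Step 3: sort the hits.
    List<Doc> search(String category, String pattern, Strategy strategy, Comparator<Doc> order) {
        List<Doc> docs = categories.get(category);
        if (docs == null) {
            return new ArrayList<>();
        }
        List<Doc> hits = new ArrayList<>(strategy.search(pattern, docs));
        hits.sort(order);
        return hits;
    }
}

final class WorkflowDemo {
    // Plain substring match, standing in for the "Normal" strategy.
    static final MiniSearchEngine.Strategy NORMAL = (pattern, docs) -> {
        List<MiniSearchEngine.Doc> hits = new ArrayList<>();
        for (MiniSearchEngine.Doc d : docs) {
            if (d.content.contains(pattern)) {
                hits.add(d);
            }
        }
        return hits;
    };

    // Newest documents first, by creation time.
    static final Comparator<MiniSearchEngine.Doc> NEWEST_FIRST =
            Comparator.comparing((MiniSearchEngine.Doc d) -> d.createdAt).reversed();

    static List<String> run() {
        MiniSearchEngine engine = new MiniSearchEngine();
        engine.createCategory("java");
        LocalDateTime base = LocalDateTime.of(2025, 1, 14, 9, 0);
        engine.addDocument("java", new MiniSearchEngine.Doc("Streams", "java streams in depth", base.plusHours(2)));
        engine.addDocument("java", new MiniSearchEngine.Doc("Records", "java records explained", base));
        engine.addDocument("java", new MiniSearchEngine.Doc("Go intro", "goroutines and channels", base.plusHours(1)));

        List<String> titles = new ArrayList<>();
        for (MiniSearchEngine.Doc d : engine.search("java", "java", NORMAL, NEWEST_FIRST)) {
            titles.add(d.title);
        }
        return titles;
    }
}
```

Calling WorkflowDemo.run() returns the titles of the two matching documents, newest first; the "Go intro" document is filtered out because its content does not contain the pattern.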

Key Design Patterns Used

Strategy Design Pattern:
- Enables flexibility in implementing different search strategies (e.g., Normal Search, Frequency-Based Search) without changing the core search engine logic.
Factory Design Pattern (optional):
- Could be used to instantiate the appropriate search strategy dynamically.
Data Access Abstraction:
- Encapsulates the operations for categories and documents, allowing for easy maintenance and extension.
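
One possible shape for the optional factory, sketched under assumptions: SearchMode, ContentScorer and ContentScorerFactory are hypothetical names, and the strategy is simplified to score a single piece of content rather than a document list.

```java
// Hypothetical mode names; the article only calls the modes "Normal" and "Frequency-Based".
enum SearchMode { NORMAL, FREQUENCY }

// Simplified strategy: 0 means "no match", any positive value means the content matched.
interface ContentScorer {
    int score(String pattern, String content);
}

final class ContentScorerFactory {
    private ContentScorerFactory() {}

    // Maps a mode chosen at runtime to the matching strategy implementation.
    static ContentScorer forMode(SearchMode mode) {
        switch (mode) {
            case NORMAL:
                return (pattern, content) -> content.contains(pattern) ? 1 : 0;
            case FREQUENCY:
                return ContentScorerFactory::countOccurrences;
            default:
                throw new IllegalArgumentException("Unsupported mode: " + mode);
        }
    }

    // Counts non-overlapping occurrences; an empty pattern counts as zero matches.
    static int countOccurrences(String pattern, String content) {
        if (pattern.isEmpty()) {
            return 0;
        }
        int count = 0;
        int index = content.indexOf(pattern);
        while (index != -1) {
            count++;
            index = content.indexOf(pattern, index + pattern.length());
        }
        return count;
    }
}
```

With this arrangement the caller passes only a mode value, and adding a new strategy means adding one enum constant and one switch branch.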

Advantages of the Design

Modular and Extensible:
- The system is designed to easily add new features, such as:
  - Advanced search strategies.
  - Persistent storage for documents.
  - Integration with external tools like Elasticsearch.
Efficient Search:
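- Each search is a linear scan over the documents of a single category held in memory, which keeps lookups simple and fast for modest collections; further optimization (for example, an index) can be added behind the SearchStrategy interface.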
Dynamic Sorting:
- Users can customize the ordering of search results without modifying the core search logic.
Ease of Testing:
- Each component (e.g., search strategies, category management) can be tested independently.

Future Enhancements

Pagination:
- Add support for paginated search results to handle large datasets.
Search Ranking:
- Implement ranking algorithms to prioritize results based on relevance.
Distributed Systems:
- Scale the design for distributed environments to handle a higher volume of data and search requests.
Integration with Persistent Storage:
- Extend the in-memory search engine to integrate with databases or distributed storage systems.
Full-Text Search:
- Incorporate advanced full-text search capabilities using tools like Lucene or Elasticsearch.
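
As an illustration of the pagination idea, here is a minimal sketch that slices an already sorted result list into pages. Pager and its parameters are hypothetical; nothing like it exists in the current design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Pager {
    private Pager() {}

    // Returns the zero-based page pageIndex of items, or an empty list past the end.
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= items.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, items.size());
        // Copy the slice so the page does not alias the full result list.
        return new ArrayList<>(items.subList((int) from, to));
    }
}
```

The last page may be shorter than pageSize, and asking for a page beyond the end yields an empty list rather than an exception.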

Key Features to Consider

Category Management:
- Create, delete, and manage categories.
- Each category will contain multiple documents.
Document Management:
- Insert and delete documents within a category.
- Each document will have a title, content, creation date, update date, and metadata for search.
Search Functionality:
- Support search for a pattern in a specific category.
- Provide two search modes:
  - Normal Search: Find documents containing the search pattern.
  - Frequency-Based Search: Return results based on the frequency of the search term in the document.
Ordering Search Results:
- Allow ordering by:
  - Creation Date
  - Last Update Date

Proposed Classes and Interfaces

1. Category

Represents a category in the search engine.

2. Document

Represents a blog post or document in a category.

3. SearchEngine

Main entry point to manage categories, documents, and search functionality.

4. SearchStrategy

Interface for search strategies (Normal, Frequency-Based).

5. NormalSearchStrategy

Implements a basic search for documents containing a pattern.

6. FrequencySearchStrategy

Implements search based on the frequency of a search term.

7. SearchResult

Represents a single search result: the matched document plus its match frequency, which comparators can use for ordering.

Class Diagram

Here’s an overview of the classes and their relationships:

SearchEngine
    - Map<String, Category>

Category
    - String name
    - List<Document>

Document
    - String title
    - String content
    - LocalDateTime createdAt
    - LocalDateTime updatedAt

SearchStrategy (interface)
    + List<SearchResult> search(String pattern, List<Document> documents)

NormalSearchStrategy
    + search(pattern, documents)

FrequencySearchStrategy
    + search(pattern, documents)

SearchResult
    - Document document
    - int frequency (optional)

1. Document Class

import java.time.LocalDateTime;

public class Document {
    private final String title;
    private String content;
    private final LocalDateTime createdAt;
    private LocalDateTime updatedAt;

    public Document(String title, String content) {
        this.title = title;
        this.content = content;
        // Use a single timestamp so createdAt and updatedAt start out equal.
        LocalDateTime now = LocalDateTime.now();
        this.createdAt = now;
        this.updatedAt = now;
    }

    public String getTitle() {
        return title;
    }

    public String getContent() {
        return content;
    }

    public LocalDateTime getCreatedAt() {
        return createdAt;
    }

    public LocalDateTime getUpdatedAt() {
        return updatedAt;
    }

    public void updateContent(String newContent) {
        this.content = newContent;
        this.updatedAt = LocalDateTime.now();
    }
}

2. Category Class

import java.util.ArrayList;
import java.util.List;

public class Category {
    private String name;
    private List<Document> documents;

    public Category(String name) {
        this.name = name;
        this.documents = new ArrayList<>();
    }

    public String getName() {
        return name;
    }

    public List<Document> getDocuments() {
        return documents;
    }

    public void addDocument(Document document) {
        documents.add(document);
    }

    public void removeDocument(Document document) {
        documents.remove(document);
    }
}

3. SearchStrategy Interface

import java.util.List;

public interface SearchStrategy {
    List<SearchResult> search(String pattern, List<Document> documents);
}

4. NormalSearchStrategy Class

import java.util.ArrayList;
import java.util.List;

public class NormalSearchStrategy implements SearchStrategy {
    @Override
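    public List<SearchResult> search(String pattern, List<Document> documents) {
        List<SearchResult> results = new ArrayList<>();
        // A null or empty pattern matches nothing (contains("") would match every document).
        if (pattern == null || pattern.isEmpty()) {
            return results;
        }
        for (Document document : documents) {
            if (document.getContent().contains(pattern)) {
                // Frequency is not computed in normal search, so it is left at 0.
                results.add(new SearchResult(document, 0));
            }
        }
        return results;
    }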
}
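
The match above is case-sensitive, so a search for "java" would miss "Java". If case-insensitive matching is wanted, one hedged variant is sketched below; CaseInsensitiveMatch is a hypothetical helper, not part of the original design.

```java
import java.util.Locale;

final class CaseInsensitiveMatch {
    private CaseInsensitiveMatch() {}

    // True when content contains pattern, ignoring letter case.
    // Null inputs and an empty pattern are treated as "no match".
    static boolean containsIgnoreCase(String content, String pattern) {
        if (content == null || pattern == null || pattern.isEmpty()) {
            return false;
        }
        // Locale.ROOT keeps the lowercasing independent of the machine's default locale.
        return content.toLowerCase(Locale.ROOT).contains(pattern.toLowerCase(Locale.ROOT));
    }
}
```

A strategy could call this helper in place of String.contains without any change to the SearchStrategy interface.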

5. FrequencySearchStrategy Class

import java.util.ArrayList;
import java.util.List;

public class FrequencySearchStrategy implements SearchStrategy {
    @Override
    public List<SearchResult> search(String pattern, List<Document> documents) {
        List<SearchResult> results = new ArrayList<>();
        for (Document document : documents) {
            int frequency = calculateFrequency(document.getContent(), pattern);
            if (frequency > 0) {
                results.add(new SearchResult(document, frequency));
            }
        }
        return results;
    }

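    // Counts non-overlapping occurrences of pattern in content.
    private int calculateFrequency(String content, String pattern) {
        // Guard: an empty pattern matches at every index and would loop forever below.
        if (pattern == null || pattern.isEmpty()) {
            return 0;
        }
        int count = 0;
        int index = content.indexOf(pattern);
        while (index != -1) {
            count++;
            index = content.indexOf(pattern, index + pattern.length());
        }
        return count;
    }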
}
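
Note that this strategy only attaches a frequency to each result; the ranking itself comes from whichever Comparator the caller supplies. A minimal sketch of a frequency-descending ordering follows, where ScoredHit is a hypothetical stand-in for SearchResult and the alphabetical tie-break is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for SearchResult: a title plus its match count.
final class ScoredHit {
    final String title;
    final int frequency;

    ScoredHit(String title, int frequency) {
        this.title = title;
        this.frequency = frequency;
    }
}

final class FrequencyRanking {
    private FrequencyRanking() {}

    // Highest frequency first; ties broken alphabetically by title for a predictable order.
    static final Comparator<ScoredHit> BY_FREQUENCY_DESC =
            Comparator.comparingInt((ScoredHit h) -> h.frequency).reversed()
                      .thenComparing((ScoredHit h) -> h.title);

    // Returns the titles in ranked order without modifying the input list.
    static List<String> rankedTitles(List<ScoredHit> hits) {
        List<ScoredHit> copy = new ArrayList<>(hits);
        copy.sort(BY_FREQUENCY_DESC);
        List<String> titles = new ArrayList<>();
        for (ScoredHit h : copy) {
            titles.add(h.title);
        }
        return titles;
    }
}
```

An equivalent comparator over SearchResult could be passed as the last argument of SearchEngine.search to get frequency-ranked output.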

6. SearchResult Class

public class SearchResult {
    private Document document;
    private int frequency;

    public SearchResult(Document document, int frequency) {
        this.document = document;
        this.frequency = frequency;
    }

    public Document getDocument() {
        return document;
    }

    public int getFrequency() {
        return frequency;
    }
}

7. SearchEngine Class

import java.util.*;

public class SearchEngine {
    private Map<String, Category> categories;

    public SearchEngine() {
        this.categories = new HashMap<>();
    }

    public void createCategory(String name) {
        // putIfAbsent keeps an existing category (and its documents) instead of overwriting it.
        categories.putIfAbsent(name, new Category(name));
    }

    public void deleteCategory(String name) {
        categories.remove(name);
    }

    public void addDocument(String categoryName, Document document) {
        Category category = categories.get(categoryName);
        if (category != null) {
            category.addDocument(document);
        }
    }

    public void deleteDocument(String categoryName, Document document) {
        Category category = categories.get(categoryName);
        if (category != null) {
            category.removeDocument(document);
        }
    }

    public List<SearchResult> search(String categoryName, String pattern, SearchStrategy strategy, Comparator<SearchResult> comparator) {
        Category category = categories.get(categoryName);
        if (category == null) {
            return Collections.emptyList();
        }
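        // Copy so sorting works even if a strategy returns an immutable list.
        List<SearchResult> results = new ArrayList<>(strategy.search(pattern, category.getDocuments()));
        // A null comparator means "keep the strategy's order".
        if (comparator != null) {
            results.sort(comparator);
        }
        return results;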
    }
}

GitHub code link for the above design

Shashank’s Substack
