The Characters Analyzer lists every distinct character used in a HelpNDoc project. Use it to spot unexpected or non-standard characters, prepare content for localization, and keep character usage consistent and clean across all topics.

Overview

The Characters Analyzer displays a list of every distinct character found in the project, along with the following details:

  • Character: The actual character used in the content.
  • Unicode representation: The character’s Unicode code point (e.g., U+00A0).
  • Unicode category: The character's classification, such as Letter, Number, Punctuation, Symbol, or Separator.
  • Occurrences: The total number of times the character appears across all topics in the project.
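
To make the four columns concrete, here is a minimal Python sketch that builds a similar inventory from a plain string. It is an illustration only, not how HelpNDoc computes its list: the function name `character_inventory` and the sample text are assumptions, and Python's standard `unicodedata` module reports two-letter general categories (for example "Zs" for a space separator) rather than the labels shown in the analyzer.

```python
import unicodedata
from collections import Counter


def character_inventory(text):
    """Return (character, code point, category, occurrences) rows.

    Rows are ordered from most to least frequent.
    """
    rows = []
    for char, count in Counter(text).most_common():
        code_point = f"U+{ord(char):04X}"       # e.g. U+00A0
        category = unicodedata.category(char)   # e.g. "Ll", "Zs", "Po"
        rows.append((char, code_point, category, count))
    return rows


# Hypothetical sample containing a non-breaking space (U+00A0).
sample = "Hello\u00a0world"
for row in character_inventory(sample):
    print(row)
```

The non-breaking space shows up as its own row with the code point U+00A0, which is the kind of entry that is easy to miss when reading the text itself.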

This feature is especially useful for detecting:

  • Hidden or invisible characters (e.g., non-breaking spaces, zero-width spaces)
  • Characters outside the expected language set
  • Inconsistent use of punctuation, symbols, or marks
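
As a rough illustration of what "hidden or invisible" means in Unicode terms, the following Python sketch flags characters whose general category is a format, control, or separator category, while ignoring ordinary spaces, tabs, and line breaks. The set of suspect categories and the function name `find_invisible` are assumptions chosen for this example; HelpNDoc's own criteria may differ.

```python
import unicodedata

# Assumption for this sketch: these general categories cover the usual
# invisible characters (format, control, space/line/paragraph separators).
SUSPECT_CATEGORIES = {"Cf", "Cc", "Zs", "Zl", "Zp"}
ORDINARY_WHITESPACE = {" ", "\t", "\n", "\r"}


def find_invisible(text):
    """Return (index, code point, name) for each suspicious invisible character."""
    hits = []
    for index, char in enumerate(text):
        if char in ORDINARY_WHITESPACE:
            continue
        if unicodedata.category(char) in SUSPECT_CATEGORIES:
            name = unicodedata.name(char, "<unnamed>")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


# Hypothetical sample: a non-breaking space and a zero-width space.
print(find_invisible("a\u00a0b\u200bc"))
```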

Accessing the Project's Characters Analyzer

To launch the Characters Analyzer for the entire project:

  • Go to the "Home" ribbon tab in HelpNDoc.
  • Click the "Analyze project" button in the "Project" group.
  • In the Project Analyzer window, navigate to the Characters section.

Available Actions

The "Find All" button locates every instance of the selected character across the project, which helps when cleaning up problematic characters or reviewing usage patterns. The results list each occurrence together with the name of the topic that contains it.
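
Conceptually, a search of this kind can be sketched in Python as follows. The `topics` dictionary (topic name mapped to topic text) and the function name `find_all` are hypothetical stand-ins for project content; they do not correspond to a HelpNDoc API, and the sketch only shows the idea of grouping matches by topic.

```python
def find_all(topics, char):
    """Return (topic name, positions) for every topic containing char."""
    results = []
    for topic_name, text in topics.items():
        positions = [i for i, c in enumerate(text) if c == char]
        if positions:
            results.append((topic_name, positions))
    return results


# Hypothetical project content keyed by topic name.
topics = {
    "Introduction": "Price:\u00a010 EUR",
    "Installation": "Run the installer.",
}
print(find_all(topics, "\u00a0"))
```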

Using the Filter Menu

The Filter menu narrows the list of characters by Unicode category so that you can focus on the types of characters relevant to your editing task.

Available filter options:

  • Show letters only: Displays characters classified as letters (e.g., A–Z, accented characters, letters from non-Latin scripts).
  • Show numbers only: Shows digits and numerals from all supported scripts.
  • Show punctuation only: Limits the list to punctuation marks such as commas, periods, and quotation marks.
  • Show symbols only: Displays characters like currency signs, math operators, arrows, and emoji.
  • Show marks only: Lists diacritical and combining marks (e.g., accents or modifiers).
  • Clear filter: Resets the view to show all characters used in the project.
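
The filter options line up with the major classes of the Unicode general category, which can be sketched in Python as shown here. The mapping from filter names to category prefixes (L, N, P, S, M) is an assumption made for illustration, as are the names `FILTERS` and `filter_characters`; the sketch is not HelpNDoc's implementation.

```python
import unicodedata

# Assumed mapping: filter name -> first letter of the Unicode general category.
FILTERS = {
    "letters": "L",      # Lu, Ll, Lt, Lm, Lo
    "numbers": "N",      # Nd, Nl, No
    "punctuation": "P",  # Pc, Pd, Ps, Pe, Pi, Pf, Po
    "symbols": "S",      # Sm, Sc, Sk, So
    "marks": "M",        # Mn, Mc, Me
}


def filter_characters(text, filter_name=None):
    """Return the distinct characters of text that match the chosen filter.

    Passing None acts like "Clear filter" and returns every distinct character.
    """
    distinct = set(text)
    if filter_name is None:
        return sorted(distinct)
    prefix = FILTERS[filter_name]
    return sorted(c for c in distinct if unicodedata.category(c).startswith(prefix))


# Hypothetical sample: a letter, a digit, a comma, a euro sign, a combining accent.
sample = "a1,\u20ac\u0301"
print(filter_characters(sample, "symbols"))
print(filter_characters(sample, "marks"))
```

Matching on the first letter of the category keeps each filter broad, so "symbols" covers currency signs, math operators, and other symbols alike.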