Acrolinx Reuse User Guide
Version: 4.1

Copyright © 2014 Acrolinx GmbH
All rights reserved

The software contains proprietary information of Acrolinx GmbH. It is provided under a license agreement containing restrictions on use and disclosure and is also protected by copyright law. Reverse engineering of the software is prohibited.

Due to continued product development, this information may change without notice. The information and intellectual property contained in this document is confidential between Acrolinx GmbH and the customer, and remains the exclusive property of Acrolinx GmbH. If you find any errors in the documentation, please report them to us in writing. Acrolinx GmbH does not guarantee that this document is error-free.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of Acrolinx GmbH.

Acrolinx® is registered in the U.S. Patent and Trademark Office. Acrolinx™ is a trademark of Acrolinx GmbH. All trademarked and copyrighted names used within this and supplemental documents are the sole and exclusive property of their registered or common law owners.

Acrolinx GmbH
Friedrichstraße 100
D-10117 Berlin
Germany
Phone: +49 30 288 84 83-30
Fax: +49 30 288 84 83-39
E-mail: [email protected]
Website: http://www.acrolinx.com

DocID: IR-EN-285463-20140919-v4.1-b5065

NOTE: Because the Acrolinx user guides are updated frequently, there may be newer information in the online version of these help files.
Contents

Introduction 4
  Acrolinx Reuse 4
  Intelligent Reuse Process 4
    Before You Start 6
Creating a Reuse Repository 7
  How Acrolinx Identifies Sentences 7
  Creating and Updating Reuse Repositories with Harvested Sentences 7
    Harvesting Sentences 7
    Adding Harvested Sentences to a Repository 10
    Creating a Reuse Repository from an Import File 13
  Creating Empty Reuse Repositories 14
  Canceling a Repository Task 15
Managing Clusters 16
  The Clusters Page 16
  Representative Sentences 18
  Editing Clusters 18
    Sorting and Filtering the Cluster List 19
    Changing the Cluster Status 19
    Changing Representative Sentences 20
    Removing Sentences from Clusters 20
    Adding Sentences to Clusters 20
    Creating New Clusters 21
Managing Reuse Repositories 23
  Enabling Reuse Repositories for Checking 23
    Assigning a Reuse Repository to a Rule Set 24
    Activating or Deactivating Repositories 24
    Language Server Statuses and Warnings 24
  Exporting a Reuse Repository 25
  Checking for Reuse Issues in the Acrolinx Plug-ins 26
  Deleting a Reuse Repository 26
  Backing up Reuse Repositories 27

Chapter 1 Introduction

Acrolinx Reuse

Acrolinx Reuse is a powerful tool for maintaining consistent authoring standards and eliminating redundancy in documentation projects.
Acrolinx Reuse uses linguistic analysis to match sentences based on meaning. Text from content repositories or translation memories is automatically analyzed to produce small groups of sentences with similar meaning. The preferred wording can be easily validated, selected, and released for reuse.

For example, the following variations of a sentence might appear in your documentation:

The following items come in your TopSpin shipment.
Your TopSpin shipment includes the following items.
Your TopSpin package is shipped with the following items.

You can choose one of these sentences to be your standard sentence, and Acrolinx identifies the variations and proposes your chosen standard sentence.

When authors run a check in an Acrolinx plug-in, Acrolinx recognizes sentences with similar meanings. The author receives a suggestion with the preferred wording. All suggestions have been automatically validated by the system and released by the linguistic administrator. The author can accept the suggestion with a single mouse click.

Intelligent Reuse Process

The Acrolinx Reuse module is a flexible tool which you can use for a variety of purposes, such as cleaning a translation memory or identifying new terminology. The most common use for Acrolinx Reuse is to analyze and check consistency in a set of documentation. The following illustration summarizes the steps and components that are involved in using Acrolinx Reuse.

Figure 1: Standard Reuse Process

Acrolinx recommends the following major steps when starting up a documentation consistency project with Acrolinx Reuse.

1 Harvest Sentences: Extract sentences from product information and store them on the Acrolinx Server.
• Procedure Summary: Create a sentence bank to store sentences and run checks by using the Acrolinx plug-ins to harvest sentences from your documents. For more information, see the topic Harvesting Sentences (see "Harvesting Sentences" on page 7).
2 Create Repository: Create a repository of sentences which are grouped together based on structure and meaning. Unlike sentence banks, the sentences in a repository are grouped into clusters. When Acrolinx groups sentences together, a representative sentence is automatically selected for each cluster.
• Procedure Summary: Add harvested sentences or import sentences from a text file to a new repository. For more information, see the chapter Creating a Reuse Repository (see "Creating a Reuse Repository" on page 7).

3 Enable Repository for Statistics and Checking: Enable the repository to gather statistics and provide suggestions for sentences that differ from the representative sentence.
• Procedure Summary: Assign your repository to a rule set, and instruct authors to run checks with the reuse repository. For more information, see the topics Enabling Reuse Repositories for Checking (see page 23) and Checking for Reuse Issues in the Acrolinx Plug-ins (see page 26).

4 (Optional) Edit Clusters: Use the statistics to review the most commonly used clusters and confirm the preferred form of the sentence for each cluster.
• Procedure Summary: After you have collected enough statistics, sort clusters by match frequency, and check that the representatives for frequently used clusters are correct. For more information, see the topic Editing Clusters (see "Editing Clusters" on page 18).

Before You Start

To use intelligent reuse, you must have a license and appropriately configured privileges and linguistic resources. Your server administrator must install a license that is configured with the Reuse module and assign you a role that has the privileges in the Reuse section enabled. To enable your changes for checking, you also need the privileges in the Resources section and the privilege Restart servers.

Ensure that your linguistic resources are configured for Reuse.
When linguistic resources are configured for reuse, the ReuseHarvesting rule set is displayed on the language server page in the Dashboard. If the ReuseHarvesting rule set is missing, contact your Acrolinx project consultant.

Chapter 2 Creating a Reuse Repository

A reuse repository is a repository of sentences which are grouped together based on structure and meaning.
• A group of similar sentences is called a cluster.
• A reuse repository is normally used to store clusters which have a similar subject area or relate to a specific product.

You can create a repository in the following ways:
• Create a reuse repository from harvested sentences (see "Adding Harvested Sentences to a Repository" on page 10).
• Create a reuse repository from an import file (see "Creating a Reuse Repository from an Import File" on page 13).
• Create an empty repository from the Repositories page (see "Creating Empty Reuse Repositories" on page 14).

On the Progress page, you can monitor the progress of a repository which is being created or updated and view a list of repositories which have been completed or canceled. You can also cancel the tasks for selected repositories.

How Acrolinx Identifies Sentences

Depending on how your linguistic resources are configured, some characters can cause Acrolinx to interpret your sentence as two sentences.

Example: Suppose you are importing a TMX file which contains a segment with the following sentence:

The Topspin package always contains at least three items: a fan tray, a system controller, and a warranty card.

The sentence contains a colon, and colons are interpreted by standard linguistic resources as the end of the sentence. In this example, the server interprets the segment as two separate sentences.

Creating and Updating Reuse Repositories with Harvested Sentences

Harvesting Sentences

Sentence harvesting is the process of detecting new sentences whenever a document is checked.
After you have harvested enough sentences, you can add them to a new or existing repository.

The first step in harvesting sentences is to create sentence banks to store the sentences (see "Creating a Sentence Bank" on page 8).

You can harvest sentences with:
• the Acrolinx Plug-ins (see "Harvesting Sentences with an Acrolinx Plug-in" on page 9)
• the Acrolinx Batch Checker (see "Harvesting Sentences with the Acrolinx Batch Checker" on page 9)

Working with Sentence Banks

A sentence bank stores the raw source data which you use to create a reuse repository. Unlike a reuse repository, the sentences in a sentence bank are not grouped into clusters. You can keep building up a sentence bank and experiment with different clustering settings by creating several reuse repositories from the same set of harvested sentences. You can also create several sentence banks to store sentences from different types of documentation.

Creating a Sentence Bank

To create a sentence bank, follow these steps:
1 Open the page Reuse > Create and Update > Harvest Sentences and click New.
2 In the New Sentence Bank dialog box, enter a name and select a language for the sentence bank and click OK.
The new sentence bank is created and is automatically selected as the default sentence bank for the language you defined.

Selecting Default Sentence Banks

To collect sentences in the Acrolinx Plug-ins, a default sentence bank must be defined for each checking language.

To set a default sentence bank for a checking language, follow these steps:
1 Open the page Reuse > Create and Update > Harvest Sentences.
2 In the Default column, select a sentence bank to store the harvested sentences in your required checking language. If you have sentence banks in several languages, you can select one sentence bank from each language.
The changes take effect immediately.
Viewing the Contents of a Sentence Bank

To view the contents of a sentence bank:
• Click the name of a sentence bank on the Sentence Banks Page.
You can click the column headers to sort the sentences alphabetically or by Acrolinx score. Sentences with a high Acrolinx score are lower quality sentences.

TIP: If you find malformed sentences, you can use the Delete Button to remove them from the sentence bank.

Setting the Language for Legacy Sentence Banks

Legacy sentence banks are sentence banks which were created with an Acrolinx Server version earlier than 1.5. In server versions 1.5 or later, all sentence banks must have a language defined before you can start the clustering wizard. If your installation contains legacy sentence banks, the Set Language button appears on the sentence banks page. This procedure is possible only if the Set Language button is visible.

To set the language for legacy sentence banks, follow these steps:
1 Open the page Reuse > Harvest Sentences.
2 Use the checkboxes in the Name column to select legacy sentence banks. Legacy sentence banks do not have a language defined in the Language column.
3 Click Set Language.
4 In the Set Language dialog box, select a language and click OK.
5 The selected language appears in the Language column for the selected sentence banks.
If all sentence banks have a language defined after completing this procedure, the Set Language button is hidden.

Harvesting Sentences with an Acrolinx Plug-in

To harvest sentences with an Acrolinx plug-in, follow these steps:
1 (Follow this step if you have not yet selected a default sentence bank) Select a default sentence bank for your required checking language.
2 In your editor application, open a document which contains the content that you intend to cluster.
3 Open the Acrolinx Plug-in Options and select a rule set which has been configured to harvest sentences. Normally this rule set is called ReuseHarvesting.
If you are unsure about which rule set to use, ask your Acrolinx Server administrator.
4 Run a check with the checking options Spelling, Grammar, Style, and Terminology selected.
TIP: When you run a check with the main checking options selected, you ensure that each sentence receives an accurate Acrolinx score. The Acrolinx score is used to select a cluster representative during the clustering process.
5 In the Dashboard, open the Sentence Banks Page.
6 Click Refresh to update the sentence count in the Sentence Bank Table.
7 (Optional) Click the sentence bank name to view the contents of the sentence bank.

Harvesting Sentences with the Acrolinx Batch Checker

To harvest sentences with the Acrolinx Batch Checker, follow these steps:
1 Open the Acrolinx Batch Checker and locate the files which contain the content that you intend to cluster.
2 Configure your file settings and server connection settings (for more information, see the Acrolinx Batch Checker User Guide).
3 In the Acrolinx Batch Checker check options, select a rule set which has been configured to populate sentence banks. Normally this rule set is called ReuseHarvesting. If you are unsure about which rule set to use, ask your Acrolinx Server administrator.
4 Select your sentence bank from the Reuse Sentence Bank dropdown.
NOTE: The Reuse Sentence Bank dropdown is only visible when at least one sentence bank is detected. If you have created a sentence bank and cannot see the Reuse Sentence Bank dropdown, refresh your server connection.
5 Run a check with the checking options Spelling, Grammar, Style, and Terminology selected.
TIP: When you run a check with the main checking options selected, you ensure that each sentence receives an accurate Acrolinx score. The Acrolinx score is used to select a cluster representative during the clustering process.
6 In the Dashboard, open the Sentence Banks Page.
7 Click Refresh to update the sentence count in the Sentence Bank Table.
8 (Optional) Click the sentence bank name to view the contents of the sentence bank.

Adding Harvested Sentences to a Repository

After you have harvested enough sentences (see "Harvesting Sentences" on page 7), you can add them to a new or existing reuse repository.

IMPORTANT: All sentence banks must have a language defined before you add the sentences to a reuse repository. If your installation contains legacy sentence banks which were created in a server version earlier than 1.5, set the language for the legacy sentence banks first (see "Setting the Language for Legacy Sentence Banks" on page 9).

To create and update reuse repositories with harvested sentences, follow these steps:
1 Open the page Reuse > Create and Update > Harvest Sentences.
2 Select one or more sentence banks using the checkboxes next to the sentence bank names.
3 Click Add to Repository.
4 On the Repository Options page, create or update a repository and click Next. When you update, you can merge or replace the contents of an existing repository.
5 On the Cluster Settings page, configure the clustering settings (see "Cluster Settings" on page 11). The cluster settings define how the sentences should be grouped together into clusters (see "Managing Clusters" on page 16).
• The minimum word count defines the minimum number of words that must be in a sentence before the sentence can be added to a cluster.
• The minimum cluster size defines the minimum number of sentences that must be in a cluster before the cluster can be added to the repository.
• The cluster strictness defines the quality of clusters to add to the reuse repository.
• The initial cluster status defines the initial status of all clusters in the reuse repository.
6 Click Finish. The Acrolinx Server begins grouping the sentences into clusters and adding the clusters to the reuse repository.

After the repository is created or updated, open the Repositories page and select the repository to view the clusters. If your repository is ready to be used, enable the repository for checking (see page 23).

Cluster Settings

You can use the cluster settings to influence the average number of sentences and the similarity of the sentences in each cluster. Acrolinx recommends that you experiment with the cluster settings, create several repositories from the same set of harvested sentences, and compare the results.

Minimum Word Count

The minimum word count defines the minimum number of words that must be in a sentence before the sentence can be added to a cluster. For example, titles are treated as individual sentences, but in some documents, titles often contain only one word. You can raise the minimum word count to prevent short titles from being added to a cluster. Stop words such as "and", "to", and "the" are included in the minimum word count.

To a certain extent, the lower the minimum word count, the more likely you are to get irrelevant sentences in your clusters. For example, consider the two titles "Configuring Browsers" and "Configuring Servers". If the minimum word count is set to two, both variants might be included in the same cluster. However, these sentences do not represent the same idea.

Minimum Cluster Size

The minimum cluster size defines the minimum number of sentences that must be in a cluster before the cluster can be added to the repository. You can use this setting to prioritize sentences which have a large degree of variation.

For example, the sentence "Open the configuration file" might be written the same way in all of your documentation with only one other variant such as "Launch the configuration file". You might have many clusters that contain sentences with only one variant. These clusters can be time-consuming to review and edit.

However, another sentence such as "End Date cannot be before the Start Date" might also be written in the following ways:

End Date must be greater than Start Date.
End Date must be greater than or equal to Start Date.
End Date must be later than Start Date.
End Time must be later than the Start Time.

A larger number of variants in sentence structure leads to higher translation costs, so a high minimum cluster size can help you focus on the most problematic sentences.

Cluster Strictness

The cluster strictness defines the quality of clusters to add to the reuse repository. There are five levels of cluster strictness ranging from lowest to highest.

At the lowest level, sentences which share only a few keywords are grouped. Clusters created with the lowest cluster strictness are usually large clusters which can contain ten or more sentences. For example, the following sentences are grouped with a setting of Lowest.

End Date cannot be before the Start Date.
End Date must be greater than Start Date.
End Date must be greater than or equal to Start Date.
End Date must be later than Start Date.
End Time must be later than the Start Time.
End date must be equal to or later than the start date.
End date should be greater than start date.
Please enter a start date that is before the end date.
Please enter an End Date that is later than or the same as the Start Date.
Please enter an end date that is later than the start date.
The Start Date cannot be after the End Date.
The actual end date must be on or after the actual start date.
The end date cannot be before the start date.
The end date must be later than or the same as the start date.

At the highest level, only sentences which are very similar are grouped. Clusters created with the highest cluster strictness are usually smaller clusters with two or more sentences. For example, the following sentences are grouped with a setting of Highest.

End Date must be later than Start Date.
Start date must be before end date!
The start date must be on or before the end date.
The start date must be prior to the end date.
Your start date must be before your end date.
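The interplay of the three numeric cluster settings can be illustrated with a small Python sketch. This is not how the Acrolinx Server clusters sentences: the word-overlap (Jaccard) similarity, the threshold values in STRICTNESS, and the function names are all assumptions made for illustration, standing in for the linguistic analysis that the server performs.

```python
import re

# Hypothetical similarity thresholds for the five strictness levels.
# The values are illustrative only; the server's real measure is not documented here.
STRICTNESS = {"lowest": 0.2, "low": 0.35, "medium": 0.5, "high": 0.65, "highest": 0.8}


def words(sentence):
    """Return lowercase word tokens; stop words are kept and counted."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Jaccard overlap of the two word sets (a stand-in for linguistic matching)."""
    wa, wb = set(words(a)), set(words(b))
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def cluster(sentences, min_word_count=2, min_cluster_size=2, strictness="medium"):
    """Group sentences greedily, then drop clusters that are too small."""
    threshold = STRICTNESS[strictness]
    clusters = []
    for sentence in sentences:
        if len(words(sentence)) < min_word_count:
            continue  # sentence is too short to be clustered
        for existing in clusters:
            # Compare against the first sentence of each existing cluster.
            if similarity(sentence, existing[0]) >= threshold:
                existing.append(sentence)
                break
        else:
            clusters.append([sentence])
    return [c for c in clusters if len(c) >= min_cluster_size]


sample = [
    "End Date must be later than Start Date.",
    "End Date must be greater than Start Date.",
    "Configuring Servers",
    "Overview",
]
print(cluster(sample, strictness="high"))
print(cluster(sample, strictness="highest"))
```

In this sketch, raising the strictness shrinks or removes clusters, raising the minimum word count excludes short titles, and raising the minimum cluster size hides clusters with few variants, which mirrors the effect of the settings described in this section.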
The choice of strictness depends on the type of data and on the intended purpose of the reuse repository.
• A lower strictness can result in a repository that contains a lot of variation, which might be useful for testing.
• To reduce the degree of variation and to eliminate clusters that are too large, you can set the cluster strictness to a higher setting.
The more harvested sentences you have, the more likely you need to use a higher strictness.

Initial Cluster Status

When you add harvested sentences to a repository, you can select the initial status for all clusters. You cannot change the initial status after you create the repository. You can only change the status of clusters individually (see "Changing the Cluster Status" on page 19). To change the status for all clusters at the same time, you must create the repository again and select a different initial cluster status.

You must set the clusters to Enabled if you want to make them available for checking. You must set the clusters to Proposed or Disabled if you want to edit the clusters further and do not want them to be available for checking.

Creating a Reuse Repository from an Import File

You can create reuse repositories by importing sentences from a text or TMX file. This feature is useful if you already have an externally validated file which contains sentences that you want to add to a new reuse repository.

What You Should Know before Importing Sentences

• When you add harvested sentences from a sentence bank, you can add new sentences to existing repositories. When adding sentences from an import file, you can add sentences only to a new repository which you create during the import process.
• Unlike adding harvested sentences, no linguistic intelligence is used to group sentences. You will have a cluster for every sentence in the import file.
• Text files must contain one sentence per line and TMX files must contain only one sentence per segment.
If a line or segment contains more than one sentence, the affected line or segment is ignored and logged. If a sentence contains special characters, it might be interpreted as two sentences and ignored (see "How Acrolinx Identifies Sentences" on page 7).

Importing a Text or TMX File

To import a text or TMX file, follow these steps:
1 Open the page Reuse > Create and Update > Import Sentences.
2 In the File Options, select the desired File format.
3 (Follow this step if you are importing a text file) Select the Encoding.
4 Locate the import file using the Browse button.
5 Click Next. The Import Preview page displays the first few rows of your import file.
6 Confirm that the preview of the import file looks correct and click Next. If you are importing a text file and some characters are not rendered correctly in the import preview, click Back and adjust the Encoding field.
7 In the Repository Options, select the Repository language and enter the Repository name.
8 (Follow this step if you are importing a TMX file) In the TMX language dropdown, select the language of the sentences to import.
9 Click Finish. The Import Summary page is displayed. The import begins and a progress bar displays underneath the Navigation Menu until the import operation completes.
• You can also use the Progress page to see more details on the import progress and estimated completion time.
• You can continue to use the Dashboard while the import is running.
10 Verify that the import was successful by viewing the import log messages (see page 14).
11 (Optional) Click Start New Import to import another file.

Viewing Import Log Messages

The Dashboard features a generic inbox for viewing log messages for any tasks that run in the background. This inbox is located at the top right of your screen in the Dashboard Menu. Large import tasks run in the background while you continue to use the Dashboard.
When the import is complete, an unread envelope icon displays next to the Messages menu item in the Dashboard Menu, which indicates that you have a new import log message.

To open an import log message, follow these steps:
1 Click the Messages menu item in the Dashboard Menu.
2 In the Messages Window, click an unread message. The Log Message Window opens and a summary of the import is displayed in a text box.
3 Click Download Detailed Import Log at the bottom of the Log Message Window to download a complete log of the import.

NOTE: All import log messages are removed from the Messages Window when the Acrolinx core server is restarted. However, the detailed log files are still stored in the server output directory. For more details, ask your Acrolinx Server administrator.

Creating Empty Reuse Repositories

Creating an empty repository is helpful if:
• you prefer to have your repositories ready before adding harvested sentences.
• you want to manually add clusters and sentences (see "Creating New Clusters" on page 21) to the repository.

To create a new reuse repository, follow these steps:
1 Open the page Reuse > Repositories.
2 Click New.
3 Enter a Name, select the Language and click OK.

Canceling a Repository Task

You might want to cancel a repository which is being created or updated under the following circumstances:
• The task is taking too long, or requires a large amount of server resources.
• You have used the wrong sentence bank or import file, and want to create or update the repository again.

To cancel a repository:
• Select the relevant repositories and click Cancel Selected. The selected repositories display as canceled in the Remaining Time column.

NOTE: It can take a few seconds before the status information is updated.

Chapter 3 Managing Clusters

In a reuse repository, clusters are used to group sentences together based on structure and meaning. A cluster normally contains several sentences.
However, if there are no other similar sentences in the repository, a cluster can also contain a single sentence. At least one sentence in a cluster is selected to be a representative sentence. The representative is the preferred wording of the sentence when several variations of the sentence exist.

The Clusters Page

The Clusters Page displays the clusters within a reuse repository.

Figure 2: The Clusters Page

The clusters page has the following parts.

Part | Use to
Keyword Filter | Apply a filter based on keywords in the cluster (see "Sorting and Filtering the Cluster List" on page 19).
New Cluster button | Create a new cluster (see "Creating New Clusters" on page 21).
Status Filter | Apply a search filter based on cluster status (see "Sorting and Filtering the Cluster List" on page 19).
Cluster Size Filter | Apply a filter based on cluster size (see "Sorting and Filtering the Cluster List" on page 19).
Cluster table columns | Sort the cluster list (see "Sorting and Filtering the Cluster List" on page 19).
Edit Cluster button | Add sentences to a cluster (see "Adding Sentences to Clusters" on page 20).
Create as New Cluster button | Create a new cluster from the selected sentences (see "Creating New Clusters" on page 21).
Delete button | Delete sentences from a cluster (see "Removing Sentences from Clusters" on page 20).
Set Representatives button | Change the representative sentences for a cluster (see "Changing Representative Sentences" on page 20).

The Clusters Table

The clusters page shows a list of clusters in a table with the following columns:

Column Name | Details
Active | Contains buttons to control the cluster status.
  • If both buttons are gray, the cluster has not yet been validated, and will not be used in a check.
  • If the On button is green, the cluster is active, and the server will flag near matches.
  • If the Off button is red, the cluster is inactive, and will not be used in a check.
ID | A unique numeric identifier of each cluster.
Representative | Indicates the current cluster name. The cluster name is taken from the first cluster representative. The cluster name can change when users edit the cluster representatives (see "Changing Representative Sentences" on page 20).
Matches | Indicates how often sentences in each cluster were offered as suggestions in the Acrolinx Plug-ins.
Last Detected | Indicates when the sentence was last offered as a suggestion.
Size | Indicates how many sentences are in a cluster.
Version | The version number of the cluster and sentences. The version number of a cluster increases when you use the clustering wizard to merge new sentences into an existing cluster. Newer sentences in a cluster have higher version numbers. The cluster inherits the version number of the newest sentence.

Representative Sentences

A representative sentence is the preferred sentence within a cluster. The Acrolinx plug-ins display the representative sentence as a suggested replacement if a variation is found.

During the clustering process, the Acrolinx Server selects the first sentence in the cluster with an Acrolinx score of zero to be the representative sentence. The Acrolinx score ranks a sentence on how closely the sentence adheres to Acrolinx style and grammar standards. If a cluster does not contain any sentences with an Acrolinx score of zero, the cluster is not added to the repository.

After you have created a repository, you can change the representative sentence or select additional representative sentences. You might choose more than one representative sentence if the sentences can be used in different ways depending on the context.

Editing Clusters

You can edit clusters to change the way sentences are grouped and to select new representative sentences. After you review and edit your clusters, you can change the status of individual clusters to enable or disable them for checking.
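As a rough illustration of the automatic selection rule described under Representative Sentences, the following Python sketch picks the first sentence with an Acrolinx score of zero and skips clusters that have none. The data layout (lists of sentence and score pairs) and the function names are hypothetical and chosen only for this example; they do not reflect the server's internal data model.

```python
def select_representative(cluster):
    """Return the first sentence with a score of zero, or None if there is none.

    `cluster` is a list of (sentence, acrolinx_score) pairs in cluster order.
    This layout is an assumption made for the example.
    """
    for sentence, score in cluster:
        if score == 0:
            return sentence
    return None


def build_repository(clusters):
    """Keep only clusters that have at least one zero-score sentence."""
    repository = []
    for cluster in clusters:
        representative = select_representative(cluster)
        if representative is None:
            continue  # no zero-score sentence: the cluster is not added
        repository.append({
            "representatives": [representative],
            "sentences": [sentence for sentence, _ in cluster],
        })
    return repository


clusters = [
    [("Start date must be before end date!", 12),
     ("The start date must be before the end date.", 0),
     ("Your start date must be before your end date.", 0)],
    [("Launch the config file", 7), ("Open config file", 3)],
]
repo = build_repository(clusters)
print(len(repo), repo[0]["representatives"])
```

The representatives field is a list because, as noted above, you can later select additional representative sentences for a cluster.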
Sorting and Filtering the Cluster List

Column Sorting

You can sort columns with bold headers in ascending or descending order. Column sorting is useful for validating large lists of clusters. For example, you can sort by match frequency to see the most frequently detected sentences within clusters.

You can also filter the cluster list based on certain attributes of a cluster, the cluster ID, or keywords within the clustered sentences.

To sort or filter your cluster list, follow these steps:
1 Click a column header.
2 Enter one or more keywords in the search field and click Search.
  NOTE: Numerals are not recognized in keyword searches. For example, the search "4 fan trays" returns all sentences that contain "fan trays" but not "4 fan trays".
3 Enter a cluster ID in the search field and click Search.
4 In the Minimum Cluster Size field, enter the minimum number of sentences that a cluster must contain in order to appear in the search results.
5 Select a filter checkbox to filter clusters by status.
  • The Proposed checkbox shows clusters that are not yet turned on or off.
  • The Enabled checkbox shows clusters that are turned on.
  • The Disabled checkbox shows clusters that are turned off.

Changing the Cluster Status

After you activate a cluster and restart the language server, sentences that vary from the representative sentence are flagged when users check their documents.

To change the cluster status:
1 Click the On button to enable the cluster or click the Off button to disable the cluster.
2 Restart the relevant language server to make your changes available for checking.

You can also make additional changes before you restart the language server.

You cannot delete a cluster because the cluster might be created again when you add new harvested sentences to your repository. When you disable a cluster, the cluster is not used for checking. A disabled cluster is also never re-created.
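As a way to reason about the status and size filters, the following Python sketch models the three cluster statuses and the Minimum Cluster Size filter. All names here (`Status`, `Cluster`, `filter_clusters`, `used_in_check`) are assumptions made for illustration and do not correspond to an Acrolinx API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Set


class Status(Enum):
    # Hypothetical names mirroring the Proposed, Enabled, and Disabled checkboxes.
    PROPOSED = "proposed"   # both buttons gray: not yet validated
    ENABLED = "enabled"     # On button green
    DISABLED = "disabled"   # Off button red


@dataclass
class Cluster:
    cluster_id: int
    size: int               # number of sentences in the cluster
    status: Status


def filter_clusters(clusters: Iterable[Cluster],
                    statuses: Set[Status],
                    min_size: int = 1) -> List[Cluster]:
    """Keep clusters whose status is selected and whose size meets the minimum."""
    return [c for c in clusters if c.status in statuses and c.size >= min_size]


def used_in_check(cluster: Cluster) -> bool:
    """Only enabled clusters are used in a check, as described in the guide."""
    return cluster.status is Status.ENABLED


clusters = [
    Cluster(1, 5, Status.PROPOSED),
    Cluster(2, 2, Status.ENABLED),
    Cluster(3, 8, Status.DISABLED),
    Cluster(4, 1, Status.ENABLED),
]

# Equivalent to selecting the Enabled checkbox with a minimum cluster size of 2.
visible = filter_clusters(clusters, {Status.ENABLED}, min_size=2)
print([c.cluster_id for c in visible])
```

The sketch treats the status checkboxes as a set of accepted statuses, so selecting several checkboxes widens the result rather than narrowing it. Whether the actual page combines checkboxes this way is an assumption.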
Changing Representative Sentences

By default, Acrolinx automatically selects a representative sentence when you create a repository. However, you can change the representative sentence and select additional representative sentences.

To edit representative sentences, follow these steps:
1 Click the cluster name to see the sentences in the cluster.
2 Select or deselect the checkboxes next to the relevant sentences.
3 Click the Set Representatives button.
4 Restart the relevant language server to make your changes available for checking.

You can also make additional changes before you restart the language server.

Removing Sentences from Clusters

You can remove a sentence from a cluster if the sentence is not relevant to the cluster. If you notice many clusters that contain irrelevant sentences, you might need to adjust your cluster settings (see page 11) and create the repository again.

Although the sentences are deleted from the cluster, they are still kept in the repository to ensure that they are not clustered again. Deleted sentences are moved to a new cluster with the status 'Disabled'.

To remove sentences from a cluster, follow these steps:
1 Click the cluster name to see the sentences in the cluster.
2 Select the checkboxes next to the relevant sentences.
3 Click Delete.
  After you click Delete, each removed sentence is moved to its own cluster that has the status 'Disabled'.
4 Restart the relevant language server to make your changes available for checking.

You can also make additional changes before you restart the language server.

Adding Sentences to Clusters

You can add sentences to a cluster by entering a new sentence or by moving an existing sentence from another cluster.

To search clusters, ensure that a language server in the search language is running. For example, to search Japanese clusters, ensure that a language server configured with Japanese is running.

To add a new sentence to a cluster, follow these steps.
1 In the cluster list, expand the cluster that you want to edit and click the Edit Cluster button.
2 Add a sentence:
  • To add a new sentence:
    a Enter a new sentence in the Enter new sentence field.
    b Click Add Sentence.
  • To move an existing sentence:
    a Enter a sentence or set of keywords and click Search Clusters.
    b Select the sentences that you want to add to the cluster and click Move Selected Sentences.
    The selected sentences are removed from the clusters in the Clusters with Similar Sentences section and added to the Target Cluster.
3 Restart the relevant language server to make your changes available for checking.

You can also make additional changes before you restart the language server.

Creating New Clusters

You can manually create a cluster to contain new sentences that are not in your repository. For example, you have a small document that contains variations of the sentence "Install gateways and switch cards". The sentences in this document were not harvested, but you need a quick way of adding the sentences to your repository.

You can also create a new cluster if an existing cluster is too big and needs to be split into two smaller clusters.

If you want your new clusters to be available for checking, ensure that you enable the clusters.

To create a cluster that contains new sentences, follow these steps.
1 In a repository, click the New Cluster button at the top of the cluster list.
  The cluster has the placeholder name "Missing Representative" until you add sentences to the cluster.
2 Add sentences to the cluster (see "Adding Sentences to Clusters" on page 20).
3 Restart the relevant language server to make your changes available for checking.

You can also make additional changes before you restart the language server.

To create a new cluster from existing sentences, follow these steps:
1 In the cluster list, expand the cluster which contains the sentences that will go into the new cluster.
2 Select the desired sentences.
3 Click Create as New Cluster.
  The Edit Cluster page opens for the new cluster.
4 Restart the relevant language server to make your changes available for checking.

You can also make additional changes before you restart the language server.

Chapter 4 Managing Reuse Repositories

The Repositories Page displays a list of reuse repositories. A Reuse Repository is used to store sentences which are grouped into clusters (see "Managing Clusters" on page 16).

The Repositories table has the following columns:

Column Name: Details
Language: The language of the repository. A repository can contain sentences in one language only.
Repository: The name of the repository. You enter the name when you create a new repository.
Active In: The rule set which contains the repository when used for checking.
Clusters: The number of clusters in the repository. The number of clusters that you can manage depends on your available hardware resources.
Sentences: The number of sentences in the repository. The number of sentences that you can manage depends on your available hardware resources.
Matches: Indicates how often sentences in the repository were offered as suggestions in the Acrolinx Plug-ins. You use statistics to prioritize clusters for editing.
Version: The version number of the repository. The version number of the repository changes when you use the clustering wizard to update or replace an existing repository.

Enabling Reuse Repositories for Checking

A repository is not enabled for checking until you assign the repository to a rule set. You assign new reuse repositories to rule sets on the Reuse Repositories page in the Resources section. You can also deactivate or activate existing assignments. A reuse repository configuration page is available for each of the languages configured in your resources.

Assigning a Reuse Repository to a Rule Set

To enable a repository for checking, you must assign the repository to a rule set.
You can assign the same repository to one or more rule sets, but you can assign only one repository to each rule set.

To assign a reuse repository to a rule set, follow these steps:
1 Navigate to Resources > Reuse Repositories.
2 Navigate to the reuse repository configuration page for the relevant language.
3 In the Repository column, select the repository for the rule set that you want to assign the repository to.
4 Select the checkbox in the Active column to activate the assignment for checking.
5 Click Save.

Activating or Deactivating Repositories

You can control whether a reuse repository is loaded by the language servers by using the checkboxes in the Active column. If you want to update the contents of the repository but do not want users to check with a repository that might change, you can deactivate the repository.

To activate or deactivate a repository assignment:
• In the relevant rows, select or deselect the checkbox in the Active column to activate or deactivate the assignment between the rule set and the reuse repository, and then click Save.

Language Server Statuses and Warnings

The table on the Reuse Repositories Configuration page contains a Status column which displays the language server status of each reuse repository. The language server status indicates the availability of the reuse repository to plug-in users.

Language Server Statuses

The following table describes the possible loading statuses.

Status: Description
Loaded: The language server loaded the reuse repository and the reuse repository is available to plug-in users.
Loading: The language server is in the process of loading the reuse repository.
Not loaded: The reuse repository was not loaded by the language server and is not available to plug-in users.
Changes not loaded: The reuse repository was edited on the reuse repository configuration page but the changes are not yet loaded by the language server.
Language configuration unavailable: The reuse repository was created for a language which is no longer configured. This status is usually displayed when the server cannot locate the language configuration file configuration.properties in either one of the following directories:
• <INSTALL_DIR>\data\<LANG_ID>\
• %ACROLINX_CONFIGURATION_ROOT%\data\<LANG_ID>\
If the language configuration file does not exist, copy your backup of this file to the following location: %ACROLINX_CONFIGURATION_ROOT%\data\<LANG_ID>\
If no backup copy exists, reinstall your linguistic resources.
Language server unavailable: A language server is configured with the language that is required by this repository. However, the language server is not running. Start the language server for the language that you are working with.

Reuse Repository Warnings

If the directory which stores the reuse repository has been deleted from your resources, the name of the reuse repository appears with a red border on the Reuse Repository Configuration page.

Figure 3: A Reuse Repository Warning

After you select another valid repository and save your changes, the red border is removed and the missing repository is removed from the repositories dropdown.

Exporting a Reuse Repository

You can use the Export Repository feature to export a list of clustered sentences to a text file for users to review offline. After you have reviewed the export file, you might want to create a revised list of representative sentences which you can import to a new repository.

To export a reuse repository, follow these steps:
1 Open the Clusters Page for your repository.
2 Click Export Repository.
  The Export Reuse Repository dialog box appears.
3 Select one of the following options:
  • Include repository summary to include the repository summary at the beginning of the file. The repository summary contains information about when the export was created and the number and size of clusters in the repository.
  • Include clusters summary to include a cluster summary above each cluster. The cluster summary contains information about the number of sentences and representatives in each cluster.
  • Include representatives only to export only the representative sentences from each cluster.
4 Click OK.
5 Right-click the download link that appears, and click Save Target As.

Checking for Reuse Issues in the Acrolinx Plug-ins

To check for reuse issues in the Acrolinx Plug-ins:
• Select a rule set that has a reuse repository assigned.
• Select the Reuse option.

Deleting a Reuse Repository

You can permanently delete repositories that are neither being updated nor active in a rule set.

The Active In column on the Repositories page displays the value 'n/a' for repositories which are not active in any rule set. Repositories with this value can usually be deleted unless they are also being updated.

If the repository you want to delete is active in one or more rule sets, remove all associations to the repository on the Reuse Repository Configuration page (see "Assigning a Reuse Repository to a Rule Set" on page 24) before commencing this procedure.

To delete a reuse repository, follow these steps:
1 On the Repositories page, select the repositories to delete.
2 Click Delete.
  The selected repositories are marked for deletion on the Repositories page. Repositories that are marked for deletion appear in strike-through formatting. Deleted repositories are removed from the list after the core server is restarted.

Backing up Reuse Repositories

You should back up your repositories in case your installation is corrupted or lost. Reuse repositories are stored in the directory <INSTALL_DIR>\data\reuse\<LANGUAGE_ID>\<REPOSITORY_NAME>.

Example: C:\Program_Files\acrolinx\acrolinx\data\reuse\EN\Topspin

To back up all reuse repositories and sentence banks, follow these steps:
1 Stop the core server.
2 Make a copy of the directory <INSTALL_DIR>\data\reuse\.
3 Restart the core server.
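The copy in step 2 can be scripted. The following Python sketch is one possible approach, not an Acrolinx tool: the function name `backup_reuse_directory`, the timestamped target folder name, and the backup location are assumptions. The script only copies files, so stopping and restarting the core server (steps 1 and 3) must still be done separately.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_reuse_directory(install_dir: Path, backup_root: Path) -> Path:
    """Copy <install_dir>/data/reuse to a new timestamped folder under backup_root.

    Assumes the core server is already stopped, as required by the procedure.
    Returns the path of the new backup folder.
    """
    source = install_dir / "data" / "reuse"
    if not source.is_dir():
        raise FileNotFoundError(f"Reuse directory not found: {source}")

    # Hypothetical naming scheme: one folder per backup run.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"reuse-backup-{stamp}"

    # copytree creates the target and any missing parents; it fails if the target exists.
    shutil.copytree(source, target)
    return target


# Example call (paths are placeholders; adjust them to your installation):
# backup_reuse_directory(Path(r"C:\Program_Files\acrolinx\acrolinx"), Path(r"D:\backups"))
```

Because each run writes to a new timestamped folder, earlier backups are not overwritten. Remove old folders manually if disk space is a concern.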
Index

A
Acrolinx Reuse, overview • 4

C
clusters
  adding sentences • 20
  editing • 18
  initial status • 13
  interface • 16
  list view • 19
  managing • 16
  minimum cluster size • 11
  minimum word count • 11
  new • 21
  removing sentences • 20
  representative sentences • 18, 20
  status • 19
  strictness • 12

H
harvesting sentences
  with Acrolinx Batch Checker • 9
  with Acrolinx Plug-in • 9

I
importing
  import log messages • 14
  text or TMX files • 13

L
language servers
  statuses • 24
  warnings • 24

R
reuse
  prerequisites • 6
  process overview • 4
  repositories • 7
  using • 26
reuse repositories
  activating • 24
  adding harvested sentences • 10
  assigning • 24
  backup • 27
  cancelling • 15
  clusters • 11
  creating • 7
  creating from import files • 13
  deactivating • 24
  deleting • 26
  empty • 14
  enabling • 23
  exporting • 25
  harvesting sentences • 7
  importing sentences • 13
  interface • 23
  managing • 23
  updating • 7
rule sets • 24

S
sentence banks
  compatibility • 9
  contents • 8
  creating • 8
  default • 8
  legacy sentence banks • 9
  overview • 8
sentences, identifying • 7