To create an engine:

  1. Go to Engines.

  2. Click Create new.

  3. Specify the name, the languages, and the group the new engine will belong to. Unlike corpora, an engine can belong to only one group.

  4. Choose an engine type. The default is Domain-adapted, but for language combinations where Globalese offers stock engines, you can choose Stock+ as well.

  5. If you selected Domain-adapted, the option to leverage stock corpora may be visible. It appears only for language combinations for which Globalese provides stock corpora.

  6. Select the corpora you want to include in the engine.

    1. For domain-adapted engines, try to collect at least 100,000 segment pairs of relevant master corpora and at least 1 million segment pairs in total for the engine.
      If you use stock corpora, there is no minimum threshold for the total number of segment pairs.

    2. For stock+ engines, there is no minimum corpus volume requirement.

  7. Click Save.
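
The volume guidelines in step 6 can be checked before you create the engine. The following is a minimal sketch, not part of Globalese: the `EnginePlan` class, the `volume_warnings` function and the engine type strings are hypothetical names chosen for illustration, and the segment pair counts are assumed to come from your own corpus inventory.

```python
from dataclasses import dataclass

# Guideline values taken from step 6 above.
MIN_MASTER_PAIRS = 100_000
MIN_TOTAL_PAIRS = 1_000_000


@dataclass
class EnginePlan:
    """Hypothetical description of a planned engine (not a Globalese object)."""
    engine_type: str            # "domain-adapted" or "stock+"
    master_pairs: int           # segment pairs in master corpora
    auxiliary_pairs: int = 0    # segment pairs in auxiliary corpora
    uses_stock_corpora: bool = False


def volume_warnings(plan: EnginePlan) -> list[str]:
    """Return a list of guideline warnings; an empty list means none apply."""
    warnings: list[str] = []

    # Stock+ engines have no minimum corpus volume requirement.
    if plan.engine_type == "stock+":
        return warnings

    if plan.master_pairs < MIN_MASTER_PAIRS:
        warnings.append(
            f"master corpora: {plan.master_pairs} segment pairs, "
            f"guideline is at least {MIN_MASTER_PAIRS}"
        )

    # The total threshold does not apply when stock corpora are used.
    total = plan.master_pairs + plan.auxiliary_pairs
    if not plan.uses_stock_corpora and total < MIN_TOTAL_PAIRS:
        warnings.append(
            f"total: {total} segment pairs, "
            f"guideline is at least {MIN_TOTAL_PAIRS}"
        )

    return warnings


if __name__ == "__main__":
    plan = EnginePlan("domain-adapted", master_pairs=120_000, auxiliary_pairs=300_000)
    for warning in volume_warnings(plan):
        print(warning)
```

The check only mirrors the guideline numbers given above; it says nothing about how relevant the corpora are to your domain, which still has to be judged by hand.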

Master corpora

Master corpora are the core of the engine. Globalese uses them as a reference when training the engine: segment pairs from the auxiliary and/or stock corpora that belong to the same domain as the master corpora are given a higher weight in training, and all other segment pairs a lower weight.

Auxiliary corpora

Auxiliary corpora, just like stock corpora, will be used to enrich the master corpora. A bigger pool of auxiliary corpora means a bigger selection base for the training process.
Only the content most closely related to the master corpora will eventually be used for training the engine, so feel free to add any material that has good linguistic value.
