
  1. Go to Engines.

  2. Click Create new.

  3. Specify the name, the languages, and the group the new engine will belong to. Unlike corpora, an engine can belong to only one group.

  4. Choose an engine type. The default is Domain-adapted, but for language combinations where Globalese offers stock engines, you can choose Stock+ as well.

  5. If you selected Domain-adapted, the option to leverage stock corpora may also be visible. This depends on whether Globalese provides stock corpora for the selected language combination.

  6. Select the corpora you want to include in the engine.

  7. For domain-adapted engines, try to collect at least 100,000 segment pairs of relevant master corpora and at least 1 million segment pairs in total for the engine.
    If you use stock corpora, there is no minimum threshold for the total number of segment pairs.

  8. For stock+ engines, there is no minimum corpus volume requirement.

    See required corpus volumes here.

  9. Click Save.
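
The volume guidance in the steps above can be turned into a quick pre-flight check before you create an engine. The following is a minimal sketch only: `EnginePlan` and `volume_warnings` are hypothetical names invented for this illustration and are not part of any Globalese API. The segment counts are assumed to be numbers you have gathered yourself, and the thresholds are the ones quoted in the steps.

```python
from dataclasses import dataclass
from typing import List

# Recommended volumes quoted in the steps above (segment pairs).
MIN_MASTER_SEGMENTS = 100_000
MIN_TOTAL_SEGMENTS = 1_000_000


@dataclass
class EnginePlan:
    """Hypothetical description of a planned engine (not a Globalese object)."""
    engine_type: str               # "domain-adapted" or "stock+"
    master_segments: int           # segment pairs in the master corpora
    auxiliary_segments: int        # segment pairs in the auxiliary corpora
    uses_stock_corpora: bool = False


def volume_warnings(plan: EnginePlan) -> List[str]:
    """Return a warning for each recommended volume the plan falls short of."""
    warnings: List[str] = []

    # Stock+ engines have no minimum corpus volume requirement.
    if plan.engine_type == "stock+":
        return warnings

    if plan.master_segments < MIN_MASTER_SEGMENTS:
        warnings.append(
            f"Master corpora: {plan.master_segments:,} segment pairs, "
            f"recommended at least {MIN_MASTER_SEGMENTS:,}."
        )

    # The total threshold is waived when stock corpora are leveraged.
    total = plan.master_segments + plan.auxiliary_segments
    if not plan.uses_stock_corpora and total < MIN_TOTAL_SEGMENTS:
        warnings.append(
            f"Total: {total:,} segment pairs, "
            f"recommended at least {MIN_TOTAL_SEGMENTS:,}."
        )

    return warnings


if __name__ == "__main__":
    plan = EnginePlan("domain-adapted", master_segments=50_000, auxiliary_segments=200_000)
    for message in volume_warnings(plan):
        print(message)
```

The check only reports shortfalls rather than blocking anything, because the steps describe these volumes as targets to aim for ("try to collect"), not as hard limits.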


Master corpora

Master corpora are the core of the engine. Globalese uses them as a reference when training the engine. During training, segment pairs from the auxiliary and/or stock corpora that come from the same domain as the master corpora are given a higher weight, and all other segment pairs a lower weight.

Auxiliary corpora

Auxiliary corpora, just like stock corpora, will be used to enrich the master corpora. A bigger pool of auxiliary corpora means a bigger selection base for the training process.
Only the content most closely related to the master corpora will eventually be used for training the engine, so feel free to add any material that has good linguistic value.