Channel: SIL Language Software Community - Latest topics

Optimal lexical model wordlist requirements

Dear Sirs,

My 350k-word lexical model takes several seconds to load before anything appears in the predicted-words box when the keyboard launches. The SIL Euro Latin English language model, by contrast, loads with much less delay.

  1. Please publish, or link to, the source wordlist TSV file (SIL Euro Latin English.tsv) so that users can see what an optimal wordlist should look like. I could not find it here: github . com/keymanapp/lexical-models/tree/master/release/sil

  2. Please state the requirements for an optimal lexical model wordlist, and explain why it should or should not contain:
    a) abbreviations such as NBA, CIA, FBI, …
    b) first and last names such as Joe, Biden, …
    c) trademarks such as Pepsi, Ford, …
    d) city names such as London, Paris, …
    e) capitalized words, or whether all entries must be lowercase
    f) more entries than a reference wordlist such as SIL English.tsv, i.e. what the optimal size is

Knowing these requirements would let users trim their wordlists for optimal performance and quick loading.
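
To illustrate the kind of trimming I have in mind, here is a minimal Python sketch. It assumes each wordlist line is a word, a tab, and an optional frequency count, and that lines starting with # are comments; I have not confirmed that this matches the SIL wordlist. The function name trim_wordlist and the cutoff of 3 entries are hypothetical, chosen only for the example.

```python
def trim_wordlist(lines, max_entries):
    """Keep the max_entries most frequent words from TSV wordlist lines.

    Assumed layout (not confirmed against the SIL wordlist):
    word<TAB>count, with '#' starting a comment line.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        word = fields[0].strip()
        if not word:
            continue
        # A missing or non-numeric count is treated as 1.
        try:
            count = int(fields[1]) if len(fields) > 1 and fields[1].strip() else 1
        except ValueError:
            count = 1
        entries.append((word, count))
    # Stable sort: words with equal counts keep their original order.
    entries.sort(key=lambda entry: entry[1], reverse=True)
    return [f"{word}\t{count}" for word, count in entries[:max_entries]]


sample = ["# comment", "the\t500", "London\t40", "NBA\t3", "and\t300", ""]
trimmed = trim_wordlist(sample, 3)
```

With a cutoff like this, rare entries such as abbreviations drop out first, which is why I would like to know whether a frequency cutoff alone is enough or whether some categories should be excluded regardless of frequency.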

Thanks

2 posts - 2 participants
