Tucano: a series of decoder-transformers natively pre-trained in Portuguese
To stimulate the future of open development of neural text generation in Portuguese, we present both GigaVerbo, a concatenation of deduplicated Portuguese text corpora amounting to 200 billion tokens, and Tucano, a series of decoder-transformers natively pre-trained in Portuguese.




