Results for "fill-mask"

100 matches found.

google-bert

google-bert/bert-base-uncased

BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the ...

🎭 fill-mask 65,190,605
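
Each listed checkpoint can be queried with the transformers fill-mask pipeline. A minimal sketch for the top entry, assuming transformers with a PyTorch backend is installed (the prompt sentence is illustrative):

    from transformers import pipeline

    # BERT-style checkpoints use [MASK] as the mask token.
    unmasker = pipeline("fill-mask", model="google-bert/bert-base-uncased")
    for pred in unmasker("Paris is the [MASK] of France."):
        print(pred["token_str"], pred["score"])  # candidate tokens with scores
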
FacebookAI

FacebookAI/roberta-large

RoBERTa is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on t...

🎭 fill-mask 22,418,408
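
Note that RoBERTa-family checkpoints use <mask> rather than BERT's [MASK]; a sketch under the same assumptions as above:

    from transformers import pipeline

    # RoBERTa tokenizers expect <mask>, not [MASK].
    unmasker = pipeline("fill-mask", model="FacebookAI/roberta-large")
    print(unmasker("The capital of France is <mask>.")[0]["sequence"])
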
FacebookAI

FacebookAI/xlm-roberta-base

XLM-RoBERTa is a multilingual version of RoBERTa. It is pre-trained on 2.5TB of filtered CommonCrawl data containing 100 languages. RoBERTa ...

🎭 fill-mask 19,256,753
FacebookAI

FacebookAI/roberta-base

RoBERTa is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on t...

🎭 fill-mask 10,346,496
distilbert

distilbert/distilbert-base-uncased

DistilBERT is a transformers model, smaller and faster than BERT, which was pretrained on the same corpus in a self-supervised fashion, usin...

🎭 fill-mask 8,121,555
FacebookAI

FacebookAI/xlm-roberta-large

XLM-RoBERTa is a multilingual version of RoBERTa. It is pre-trained on 2.5TB of filtered CommonCrawl data containing 100 languages. RoBERTa ...

🎭 fill-mask 6,411,244
google-bert

google-bert/bert-base-multilingual-uncased

BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means it was pretrained on...

🎭 fill-mask 4,820,395
google-bert

google-bert/bert-base-multilingual-cased

BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means it was pretrained on...

🎭 fill-mask 4,032,765
google-bert

google-bert/bert-base-cased

BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the ...

🎭 fill-mask 3,169,807
emilyalsentzer

emilyalsentzer/Bio_ClinicalBERT

No description available.

🎭 fill-mask 2,532,882
microsoft

microsoft/deberta-v3-base

No description available.

🎭 fill-mask 1,874,397
distilbert

distilbert/distilroberta-base

No description available.

🎭 fill-mask 1,524,509
facebook

facebook/esm2_t33_650M_UR50D

No description available.

🎭 fill-mask 1,506,273
microsoft

microsoft/deberta-base

No description available.

🎭 fill-mask 1,305,906
thomas-sounack

thomas-sounack/BioClinical-ModernBERT-base

No description available.

🎭 fill-mask 1,102,378
google-bert

google-bert/bert-base-chinese

No description available.

🎭 fill-mask 1,097,261
distilbert

distilbert/distilbert-base-multilingual-cased

No description available.

🎭 fill-mask 1,050,711
answerdotai

answerdotai/ModernBERT-base

No description available.

🎭 fill-mask 1,015,444
dccuchile

dccuchile/bert-base-spanish-wwm-uncased

No description available.

🎭 fill-mask 974,119
kakaobank

kakaobank/kf-deberta-base

KF-DeBERTa is a language model trained jointly on a general-domain corpus and a financial-domain corpus. Its architecture is based on DeBERTa-v2. DeBERTa-v3, which uses ELECTRA's RTD as its training objective, in some...

🎭 fill-mask 970,660
microsoft

microsoft/deberta-v3-large

No description available.

🎭 fill-mask 970,194
microsoft

microsoft/mdeberta-v3-base

No description available.

🎭 fill-mask 965,937
almanach

almanach/camembert-base

No description available.

🎭 fill-mask 950,885
microsoft

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract

No description available.

🎭 fill-mask 869,969
neuralmind

neuralmind/bert-base-portuguese-cased

No description available.

🎭 fill-mask 698,788
facebook

facebook/esm2_t6_8M_UR50D

No description available.

🎭 fill-mask 683,933
google-bert

google-bert/bert-large-uncased

BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the ...

🎭 fill-mask 640,055
facebook

facebook/esm2_t12_35M_UR50D

No description available.

🎭 fill-mask 619,506
microsoft

microsoft/graphcodebert-base

GraphCodeBERT is a graph-based pre-trained model based on the Transformer architecture for programming languages, which also considers data-f...

🎭 fill-mask 576,243
neuralmind

neuralmind/bert-large-portuguese-cased

No description available.

🎭 fill-mask 574,883
facebook

facebook/esm2_t36_3B_UR50D

No description available.

🎭 fill-mask 573,031
aubmindlab

aubmindlab/bert-base-arabertv02

No description available.

🎭 fill-mask 550,090
tohoku-nlp

tohoku-nlp/bert-base-japanese-whole-word-masking

No description available.

🎭 fill-mask 503,266
julien-c

julien-c/dummy-unknown

No description available.

🎭 fill-mask 480,788
google-bert

google-bert/bert-base-german-cased

No description available.

🎭 fill-mask 473,763
nlpaueb

nlpaueb/bert-base-greek-uncased-v1

No description available.

🎭 fill-mask 450,774
emilyalsentzer

emilyalsentzer/Bio_Discharge_Summary_BERT

No description available.

🎭 fill-mask 447,101
albert

albert/albert-base-v2

ALBERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on th...

🎭 fill-mask 442,201
microsoft

microsoft/deberta-v3-small

No description available.

🎭 fill-mask 429,168
beomi

beomi/kcbert-base

No description available.

🎭 fill-mask 387,661
vinai

vinai/bertweet-base

No description available.

🎭 fill-mask 367,673
distilbert

distilbert/distilbert-base-german-cased

No description available.

🎭 fill-mask 306,512
vinai

vinai/phobert-base

Pre-trained PhoBERT models are the state-of-the-art language models for Vietnamese (Pho, i.e. "Phở", is a popular food in Vietnam):...

🎭 fill-mask 304,462
jhu-clsp

jhu-clsp/mmBERT-base

mmBERT represents the first significant advancement over XLM-R for massively multilingual encoder models. Key features include: 1. Massive L...

🎭 fill-mask 274,247
nlpaueb

nlpaueb/legal-bert-base-uncased

No description available.

🎭 fill-mask 260,728
yikuan8

yikuan8/Clinical-Longformer

No description available.

🎭 fill-mask 226,894
DeepChem

DeepChem/ChemBERTa-77M-MLM

No description available.

🎭 fill-mask 225,804
answerdotai

answerdotai/ModernBERT-large

No description available.

🎭 fill-mask 217,358
tohoku-nlp

tohoku-nlp/bert-base-japanese

No description available.

🎭 fill-mask 212,798
Shushant

Shushant/nepaliBERT

Pretraining was done on the BERT base architecture....

🎭 fill-mask 212,547
microsoft

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext

No description available.

🎭 fill-mask 206,706
vinai

vinai/phobert-base-v2

No description available.

🎭 fill-mask 189,252
hfl

hfl/chinese-roberta-wwm-ext

No description available.

🎭 fill-mask 178,797
google-bert

google-bert/bert-large-cased

BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the ...

🎭 fill-mask 170,940
microsoft

microsoft/deberta-v2-xlarge

No description available.

🎭 fill-mask 165,043
distilbert

distilbert/distilbert-base-cased

DistilBERT is a transformers model, smaller and faster than BERT, which was pretrained on the same corpus in a self-supervised fashion, usin...

🎭 fill-mask 160,734
seyonec

seyonec/ChemBERTa-zinc-base-v1

No description available.

🎭 fill-mask 149,193
Rostlab

Rostlab/prot_bert_bfd

ProtBert-BFD is based on the BERT model and was pretrained on a large corpus of protein sequences in a self-supervised fashion. This means it was ...

🎭 fill-mask 148,958
Rostlab

Rostlab/prot_bert

ProtBert is based on the BERT model and was pretrained on a large corpus of protein sequences in a self-supervised fashion. This means it was pret...

🎭 fill-mask 147,907
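
Protein language models such as ProtBert take amino-acid sequences as space-separated single letters; a hedged sketch following the usage shown on the model card:

    from transformers import pipeline

    # ProtBert expects residues as space-separated uppercase letters.
    unmasker = pipeline("fill-mask", model="Rostlab/prot_bert")
    seq = "D L I P T S S K L V V [MASK] D T S L Q V K K A F F A L V T"
    print(unmasker(seq)[0]["token_str"])  # most likely residue at the mask
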
UBC-NLP

UBC-NLP/MARBERTv2

MARBERTv2 is one of three models described in our ACL 2021 paper "ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic". We find tha...

🎭 fill-mask 144,650
Derify

Derify/ChemBERTa_augmented_pubchem_13m

A ChemBERTa model for cheminformatics (dataset: Derify/augmentedcanonicalpubchem13m; metrics: rocauc, rmse)....

🎭 fill-mask 134,988
microsoft

microsoft/mpnet-base

No description available.

🎭 fill-mask 131,792
microsoft

microsoft/infoxlm-large

InfoXLM (NAACL 2021): An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training....

🎭 fill-mask 130,109
facebook

facebook/esm1v_t33_650M_UR90S_1

No description available.

🎭 fill-mask 122,351
neulab

neulab/codebert-python

This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batchsize=32`) on Python code from the `codeparrot/github-...

🎭 fill-mask 121,979
FacebookAI

FacebookAI/xlm-mlm-en-2048

The XLM model was proposed in Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau. It’s a transformer pretrained...

🎭 fill-mask 113,313
albert

albert/albert-base-v1

ALBERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on th...

🎭 fill-mask 112,504
kuleshov-group

kuleshov-group/mdlm-owt

The model, which has a context length of `1024` and is similar in size to GPT2-medium with approximately `130 million` non-embedding paramet...

🎭 fill-mask 112,034
weiweishi

weiweishi/roc-bert-base-zh

No description available.

🎭 fill-mask 111,580
flaubert

flaubert/flaubert_base_cased

No description available.

🎭 fill-mask 111,552
studio-ousia

studio-ousia/luke-base

No description available.

🎭 fill-mask 110,398
junnyu

junnyu/roformer_chinese_small

https://github.com/ZhuiyiTechnology/roformer...

🎭 fill-mask 108,873
faisalq

faisalq/bert-base-arapoembert

An Arabic BERT masked language model for poetry....

🎭 fill-mask 106,656
moussaKam

moussaKam/mbarthez

Tagged for summarization....

🎭 fill-mask 104,768
deepmind

deepmind/language-perceiver

Perceiver IO is a transformer encoder model that can be applied on any modality (text, images, audio, video, ...). The core idea is to emplo...

🎭 fill-mask 104,460
facebook

facebook/esm2_t30_150M_UR50D

No description available.

🎭 fill-mask 101,037
tohoku-nlp

tohoku-nlp/bert-base-japanese-char

No description available.

🎭 fill-mask 96,265
nreimers

nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large

No description available.

🎭 fill-mask 82,760
novelcore

novelcore/gem-roberta

GEM-RoBERTa HQ Legal is a RoBERTa-base model pre-trained from scratch on a strategically curated 21GB corpus of Greek legal, parliamentary, ...

🎭 fill-mask 69,493
dbmdz

dbmdz/bert-base-italian-xxl-cased

No description available.

🎭 fill-mask 67,213
monologg

monologg/distilkobert

No description available.

🎭 fill-mask 66,192
alabnii

alabnii/jmedroberta-base-sentencepiece

This is a Japanese RoBERTa base model pre-trained on academic articles in medical sciences collected by Japan Science and Technology Agency ...

🎭 fill-mask 66,161
LazarusNLP

LazarusNLP/NusaBERT-base

No description available.

🎭 fill-mask 61,517
facebook

facebook/esm1b_t33_650M_UR50S

No description available.

🎭 fill-mask 61,230
dmis-lab

dmis-lab/biobert-base-cased-v1.2

No description available.

🎭 fill-mask 60,386
facebook

facebook/xlm-roberta-xl

XLM-RoBERTa-XL is an extra-large multilingual version of RoBERTa. It is pre-trained on 2.5TB of filtered CommonCrawl data containing 100 lang...

🎭 fill-mask 58,208
kykim

kykim/bert-kor-base

No description available.

🎭 fill-mask 57,591
DeepChem

DeepChem/MoLFormer-c3-1.1B

No description available.

🎭 fill-mask 55,711
nlpaueb

nlpaueb/bert-base-uncased-contracts

No description available.

🎭 fill-mask 55,408
dandelin

dandelin/vilt-b32-mlm

No description available.

🎭 fill-mask 52,047
tohoku-nlp

tohoku-nlp/bert-base-japanese-char-v2

No description available.

🎭 fill-mask 51,492
anferico

anferico/bert-for-patents

An English masked language model for patents (mask token: [MASK]); example prompt: "The present [MASK] provides a torque sensor that is small...

🎭 fill-mask 50,598
PlanTL-GOB-ES

PlanTL-GOB-ES/bsc-bio-ehr-es

Biomedical pretrained language model for Spanish. For more details about the corpus, the pretraining and the evaluation, check the official ...

🎭 fill-mask 48,935
westlake-repl

westlake-repl/SaProt_650M_AF2

No description available.

🎭 fill-mask 48,864
airesearch

airesearch/wangchanberta-base-att-spm-uncased

The architecture of the pretrained model is based on RoBERTa [[Liu et al., 2019]](https://arxiv.org/abs/1907.11692)....

🎭 fill-mask 47,960
medicalai

medicalai/ClinicalBERT

No description available.

🎭 fill-mask 45,342
klue

klue/roberta-large

No description available.

🎭 fill-mask 43,958
aubmindlab

aubmindlab/bert-base-arabertv2

No description available.

🎭 fill-mask 43,823
microsoft

microsoft/BiomedVLP-CXR-BERT-specialized

No description available.

🎭 fill-mask 43,393
huggingface

huggingface/CodeBERTa-small-v1

No description available.

🎭 fill-mask 42,937