MaChAmp at SemEval-2023 Tasks 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12: On the Effectiveness of Interm


Jun 13, 2023

To improve the ability of language models to handle Natural Language Processing (NLP) tasks, an intermediate step of pre-training has recently been introduced. In this setup, one takes a pre-trained language model, trains it on one or more NLP datasets, and then finetunes it for a target task. It is known that the selection of relevant transfer tasks is important, but some recent work has shown substantial performance gains from intermediate training on a very large set of datasets. Most previous work uses generative language models, or focuses on only one or a few tasks in a carefully curated setup. We compare intermediate training on one task with intermediate training on many tasks in a setup where the choice of datasets is more arbitrary: we use all SemEval 2023 text-based tasks. Intermediate training improves performance for most tasks. Gains are higher when doing intermediate training on a single task than on all tasks, provided the right transfer task is identified. Dataset smoothing and heterogeneous batching did not lead to robust gains in our setup.
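
The abstract does not say how dataset smoothing is computed. A common formulation, assumed here rather than taken from the paper, raises each dataset's share of the training data to a power `alpha` and renormalizes, so that small datasets are sampled more often than their raw size would suggest. The sketch below illustrates that assumption; the function name, the `alpha` parameter, and the dataset names are hypothetical.

```python
def smoothed_sampling_probs(sizes, alpha=0.5):
    """Return per-dataset sampling probabilities under exponent smoothing.

    sizes: dict mapping a dataset name to its number of training instances.
    alpha: smoothing exponent (assumed formulation); 1.0 keeps the
           size-proportional distribution, 0.0 makes it uniform.
    """
    if not sizes:
        raise ValueError("sizes must contain at least one dataset")
    total = sum(sizes.values())
    # Raise each dataset's share of the data to the power alpha ...
    weights = {name: (n / total) ** alpha for name, n in sizes.items()}
    # ... then renormalize so the probabilities sum to 1.
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


# Hypothetical sizes: one large and one small intermediate task.
probs = smoothed_sampling_probs({"large_task": 900, "small_task": 100}, alpha=0.5)
print(probs)
```

With `alpha=1.0` the small task would be drawn 10% of the time; lowering `alpha` moves the distribution toward uniform, which is the usual motivation for smoothing when task sizes differ widely.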

