Abstract
So-called standard form contracts, i.e. contracts that are drafted unilaterally by one party, such as the terms and conditions of online shops or the terms of service of social networks, are cornerstones of our modern economy. Their processing is, therefore, of significant practical value. Often, the sheer size of these contracts allows the drafting party to hide unfavourable terms from the other party. In this paper, we compare different approaches for automatically classifying the topics of clauses in standard form contracts, based on a dataset of more than 6,000 clauses from more than 170 contracts, which we collected from German and English online shops and annotated based on a taxonomy of clause topics that we developed together with legal experts. We show that, in our comparison of seven approaches ranging from simple keyword matching to transformer language models, BERT performed best with an F1-score of up to 0.91; however, much simpler and computationally cheaper models like logistic regression also achieved similarly good results, with F1-scores of up to 0.87.
Original language | English |
---|---|
Title of host publication | Proceedings of The Fifth Workshop on e-Commerce and NLP (ECNLP 5) |
Editors | Shervin Malmasi, Oleg Rokhlenko, Nicola Ueffing, Ido Guy, Eugene Agichtein, Surya Kallumadi |
Place of Publication | Dublin, Ireland |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 199-209 |
Number of pages | 11 |
ISBN (Electronic) | 978-1-955917-35-3 |
Publication status | Published - 1 May 2022 |
Event | The 5th Workshop on e-Commerce and NLP, ECNLP 2022 - Dublin, Ireland; Duration: 26 May 2022 → 26 May 2022 |
Workshop
Workshop | The 5th Workshop on e-Commerce and NLP, ECNLP 2022 |
---|---|
Abbreviated title | ECNLP 2022 |
Country/Territory | Ireland |
City | Dublin |
Period | 26/05/22 → 26/05/22 |