Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/33881

Title: Author Identification from Literary Articles with Visual Features: A Case Study with Bangla Documents
Authors: Dhar, Ankita
Mukherjee, Himadri
Sen, Shibaprasad
Sk, Md Obaidullah
Biswas, Amitabha
Gonçalves, Teresa
Roy, Kaushik
Keywords: author identification
statistical-based features
image-based features
deep learning
CNN
Issue Date: 2022
Publisher: MDPI
Citation: Dhar A, Mukherjee H, Sen S, Sk MO, Biswas A, Gonçalves T, Roy K. Author Identification from Literary Articles with Visual Features: A Case Study with Bangla Documents. Future Internet. 2022; 14(10):272. https://doi.org/10.3390/fi14100272
Abstract: Author identification is an important aspect of literary analysis, studied in natural language processing (NLP). It helps identify the most probable author of articles, news texts, or social media comments and tweets, for example. It can be applied to other domains such as criminal and civil cases, cybersecurity, forensics, identification of plagiarists, and many more. An automated system in this context can thus be very beneficial for society. In this paper, we propose a convolutional neural network (CNN)-based system for identifying the authors of literary articles. This system uses visual features along with a five-layer convolutional neural network for the identification of authors. The prime motivation behind this approach was the feasibility of identifying distinct writing styles through a visualization of the writing patterns. Experiments were performed on 1200 articles from 50 authors, achieving a maximum accuracy of 93.58%. Furthermore, to see how the system performed on different volumes of data, the experiments were performed on partitions of the dataset. The system outperformed standard handcrafted feature-based techniques as well as established works on publicly available datasets.
URI: http://hdl.handle.net/10174/33881
Type: article
Appears in Collections: INF - Publicações - Artigos em Revistas Internacionais Com Arbitragem Científica

Files in This Item:

File: futureinternet-14-00272-v4.pdf
Size: 1.62 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.