This post relies heavily on the book 'Natural Language Processing with PyTorch': https://books.google.co.kr/books?id=AIgxEAAAQBAJ&printsec=copyright&redir_esc=y#v=onepage&q&f=false

Corpus

All NLP tasks begin with text data called a corpus (plural: corpora). A corpus typically includes raw text (in ASCII or UTF-8 format) and metadata associated with this text. While raw text is a sequence of character..