The first part introduces basic procedures for handling and operating on text strings. Then, it reviews the major mathematical modeling approaches. Statistical and geometrical models are also described, along with the main dimensionality reduction methods. Finally, it presents some specific applications such as document clustering, classification, search, and terminology extraction.
All descriptions are supported with fully reproducible practical examples. Further reading, along with additional exercises and projects, is proposed at the end of each chapter for readers interested in further experimentation.