Automatic Content Analysis of Legislative Documents by Text Mining Techniques

The Parliamentary Library of Taiwan’s Legislative Yuan website provides a fair and objective channel for the public to track the daily activities of the Legislative Yuan and legislators’ inquiries. However, the quantity of generated documents is so large that the general public may not be able to keep track of each legislator’s legislative performance from these contents. To narrow the gap between legislative document generation and sense-making by the general public, this study proposes a text mining mechanism that automatically classifies the legislative documents referring to each legislator and then represents the proportion of their legislative performance in certain categories. The study first had domain experts establish a basic legislative categorical structure. A two-stage clustering was then applied to perform feature selection for the legislative documents, and the support vector machine (SVM) method was applied to build a model that classifies a new document into the appropriate category. To keep the classification categories up to date, we also evaluated the difference between content labeling by domain experts and by the general public. Experimental results show the effectiveness of the proposed text mining mechanism, which automatically classifies legislative documents to reveal legislators’ performance. With this result, people can monitor legislators and track their legislative activities using information from the Parliamentary Library of the Legislative Yuan, and update their perception of legislative performance in various categories.
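The final aggregation step described above, turning per-document category labels into each legislator's proportion per category, might be sketched as follows. This is a minimal illustration only, assuming a classifier has already assigned one category to each document; the function name `category_proportions`, the input format, the legislator identifiers, and the category labels are all hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def category_proportions(labeled_docs):
    """Map each legislator to the share of their documents per category.

    labeled_docs: iterable of (legislator, category) pairs, one pair per
    classified document. This input format is an assumption.
    """
    # Count documents per category, separately for each legislator.
    counts = defaultdict(Counter)
    for legislator, category in labeled_docs:
        counts[legislator][category] += 1

    # Normalize each legislator's counts so the shares sum to 1.
    proportions = {}
    for legislator, counter in counts.items():
        total = sum(counter.values())
        proportions[legislator] = {
            category: n / total for category, n in counter.items()
        }
    return proportions


# Illustrative data only; these are not results from the study.
sample = [
    ("Legislator A", "education"),
    ("Legislator A", "education"),
    ("Legislator A", "finance"),
    ("Legislator A", "transport"),
    ("Legislator B", "finance"),
]
result = category_proportions(sample)
```

Normalizing within each legislator, rather than across the whole corpus, keeps the profiles comparable between legislators who produce very different numbers of documents.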