Abstract:
Ocu is one of the dialects spoken by the residents of Kampar Regency. The
indigenous community of Kampar has a responsibility to preserve the Ocu
language in order to maintain the identity and distinctive character of the
Kampar region. Ocu words take many affixes, which change both the form and
the meaning of a word. This research aims to assess the process and
accuracy level of Ocu Kampar Malay stemming. The stemmer combines the
Levenshtein algorithm with morphological analysis; it was designed with UML
diagrams and implemented in the PHP programming language with a MySQL
database. In tests on Ocu language document files containing 2,262 words, the
stemmer achieved an accuracy rate of 71.37%. The majority of errors were caused
by root words missing from the dictionary and by overstemming.
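
The abstract does not specify how the Levenshtein algorithm and the morphological analysis interact, so the following is only a minimal sketch under stated assumptions: affixes are stripped first, the candidates are looked up in a root-word dictionary, and the Levenshtein distance serves as a fallback to the nearest root. It is written in Python for brevity rather than the PHP used in the study. The names `ROOTS`, `PREFIXES`, `SUFFIXES`, `stem`, and `max_distance` are hypothetical, and the tiny word list uses standard Malay placeholders, not actual Ocu data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical placeholder data (standard Malay, NOT the study's Ocu dictionary).
ROOTS = {"makan", "minum", "baca"}
PREFIXES = ("di", "me", "ber")
SUFFIXES = ("kan", "an", "i")


def stem(word: str, max_distance: int = 1) -> str:
    """Return a dictionary root for word, or word itself if none is close enough."""
    word = word.lower()
    if word in ROOTS:
        return word

    # Morphological step (assumed): generate candidates by stripping affixes.
    candidates = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            candidates.append(word[len(prefix):])
    for candidate in list(candidates):
        for suffix in SUFFIXES:
            if candidate.endswith(suffix):
                candidates.append(candidate[:-len(suffix)])

    # Exact dictionary hit on any candidate.
    for candidate in candidates:
        if candidate in ROOTS:
            return candidate

    # Levenshtein fallback (assumed): nearest root within max_distance.
    distance, root = min((levenshtein(c, r), r) for c in candidates for r in ROOTS)
    return root if distance <= max_distance else word
```

In this sketch, a word whose stripped form is absent from the dictionary is returned unchanged unless a root lies within `max_distance`, while aggressive stripping or a loose threshold can map a word to the wrong root. These correspond to the two error sources the abstract reports: missing root words and overstemming.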